1 {
2 "metadata": {
3 "name": "",
4 "signature": "sha256:1e5423dc5f125ff3a4e0c162f9302e0604b944e34e2da3f2ae6cf45de20c54bd"
5 },
6 "nbformat": 3,
7 "nbformat_minor": 0,
8 "worksheets": [
9 {
10 "cells": [
11 {
12 "cell_type": "markdown",
13 "metadata": {},
14 "source": [
15 "# CRUD In MongoDB\n",
16 "\n",
17 "This notebook will take you through a few basic operations with a dummy database, just to see how the basic CRUD (Create, Read, Update, Delete) operations work.\n",
18 "\n",
19 "We're using the [PyMongo](http://api.mongodb.org/python/current/) module to allow Python to connect to MongoDB. The notebooks in the module will describe most of the features of PyMongo you need, but you should refer to the [API documentation](http://api.mongodb.org/python/current/api/index.html) as necessary to understand the detail and nuance of PyMongo. PyMongo is also a fairly thin wrapper on MongoDB, so you may need to refer to the [MongoDB reference](http://docs.mongodb.org/manual/reference/) for some of the details."
20 ]
21 },
22 {
23 "cell_type": "code",
24 "collapsed": false,
25 "input": [
26 "# Import the required libraries\n",
27 "\n",
28 "import pymongo\n",
29 "import bson\n",
30 "from bson.objectid import ObjectId\n",
31 "import datetime\n",
32 "import collections\n",
33 "import matplotlib as mpl\n",
34 "import matplotlib.pyplot as plt\n",
35 "import numpy as np\n",
36 "%matplotlib inline"
37 ],
38 "language": "python",
39 "metadata": {},
40 "outputs": [],
41 "prompt_number": 1
42 },
43 {
44 "cell_type": "code",
45 "collapsed": false,
46 "input": [
47 "# Open a connection to the Mongo server\n",
48 "client = pymongo.MongoClient('mongodb://localhost:27017/')"
49 ],
50 "language": "python",
51 "metadata": {},
52 "outputs": [],
53 "prompt_number": 2
54 },
55 {
56 "cell_type": "code",
57 "collapsed": false,
58 "input": [
59 "# Create the crud_test database and a test_collection within it.\n",
60 "test_db = client.crud_test\n",
61 "tc = test_db.test_collection"
62 ],
63 "language": "python",
64 "metadata": {},
65 "outputs": [],
66 "prompt_number": 3
67 },
68 {
69 "cell_type": "markdown",
70 "metadata": {},
71 "source": [
72 "Note that database and collection creation in MongoDB is *lazy*: the database and collection aren't actually created in the DBMS until the first document is written."
73 ]
74 },
75 {
76 "cell_type": "markdown",
77 "metadata": {},
78 "source": [
79 "# Data structures and conversion\n",
80 "PyMongo automatically handles most of the translation between Python data structures and the JSON structures that Mongo uses. This table summarises the main equivalences.\n",
81 "\n",
82 "| Document DB term | JSON structure | Python structure |\n",
83 "|------------------|----------------|------------------|\n",
84 "| Document or sub-document | Object | dict |\n",
85 "| List | Array | list |\n",
86 "| Key | String | str |\n",
87 "| String | String | str |\n",
88 "| Number | Number | int or float, depending |\n",
89 "| Date | Date | datetime.datetime object |\n",
90 "| Object IDs | BSON ObjectId | BSON ObjectId |\n",
91 "\n",
92 "MongoDB uses BSON, a binary version of JSON, internally. You can generally ignore this, except when you want to create new ObjectIds for documents."
93 ]
94 },
95 {
96 "cell_type": "markdown",
97 "metadata": {},
98 "source": [
99 "# Create\n",
100 "Let's insert a few simple documents, just to get started.\n",
101 "\n",
102 "Note that keys in a document have to be strings, but the values can be almost anything."
103 ]
104 },
105 {
106 "cell_type": "code",
107 "collapsed": false,
108 "input": [
109 "# Insert a single document\n",
110 "tc.insert({'name': 'William', 'birthyear': 1908})\n",
111 "\n",
112 "# Insert a few\n",
113 "for n, b in zip('Patrick Jon Tom Peter Colin Sylvester Paul Christopher David Matt Peter'.split(),\n",
114 " [1920, 1919, 1934, 1951, 1943, 1943, 1959, 1964, 1971, 1982, 1958]):\n",
115 " tc.insert({'name': n, 'birthyear': b})"
116 ],
117 "language": "python",
118 "metadata": {},
119 "outputs": [],
120 "prompt_number": 4
121 },
122 {
123 "cell_type": "markdown",
124 "metadata": {},
125 "source": [
126 "# Read\n",
127 "`find_one()` will return a single (arbitrary) document. Note that Mongo automatically adds an `_id` field to each document. You can override this if you really want to, but we won't bother."
128 ]
129 },
130 {
131 "cell_type": "code",
132 "collapsed": false,
133 "input": [
134 "tc.find_one()"
135 ],
136 "language": "python",
137 "metadata": {},
138 "outputs": [
139 {
140 "metadata": {},
141 "output_type": "pyout",
142 "prompt_number": 5,
143 "text": [
144 "{'_id': ObjectId('53a9a25cdbc24112a8ed33a8'),\n",
145 " 'birthyear': 1920,\n",
146 " 'name': {'forename': 'Patrick', 'surname': 'Troughton'}}"
147 ]
148 }
149 ],
150 "prompt_number": 5
151 },
152 {
153 "cell_type": "markdown",
154 "metadata": {},
155 "source": [
156 "If we give a dict of some key-value pairs, `find_one()` will return a document that matches them."
157 ]
158 },
159 {
160 "cell_type": "code",
161 "collapsed": false,
162 "input": [
163 "tc.find_one({'name': 'Peter'})"
164 ],
165 "language": "python",
166 "metadata": {},
167 "outputs": [
168 {
169 "metadata": {},
170 "output_type": "pyout",
171 "prompt_number": 6,
172 "text": [
173 "{'_id': ObjectId('53a9a569dbc241153afa3c42'),\n",
174 " 'birthyear': 1951,\n",
175 " 'name': 'Peter'}"
176 ]
177 }
178 ],
179 "prompt_number": 6
180 },
181 {
182 "cell_type": "code",
183 "collapsed": false,
184 "input": [
185 "tc.find_one({'birthyear': 1943})"
186 ],
187 "language": "python",
188 "metadata": {},
189 "outputs": [
190 {
191 "metadata": {},
192 "output_type": "pyout",
193 "prompt_number": 7,
194 "text": [
195 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ac'),\n",
196 " 'birthyear': 1943,\n",
197 " 'name': {'forename': 'Colin', 'surname': 'Baker'}}"
198 ]
199 }
200 ],
201 "prompt_number": 7
202 },
203 {
204 "cell_type": "markdown",
205 "metadata": {},
206 "source": [
207 "`find()` will find all the documents that match the query, and returns a cursor that can be iterated over to retrieve the documents one by one."
208 ]
209 },
210 {
211 "cell_type": "code",
212 "collapsed": false,
213 "input": [
214 "tc.find({'name': 'Peter'})"
215 ],
216 "language": "python",
217 "metadata": {},
218 "outputs": [
219 {
220 "metadata": {},
221 "output_type": "pyout",
222 "prompt_number": 8,
223 "text": [
224 "<pymongo.cursor.Cursor at 0x7f8540d23fd0>"
225 ]
226 }
227 ],
228 "prompt_number": 8
229 },
230 {
231 "cell_type": "code",
232 "collapsed": false,
233 "input": [
234 "for p in tc.find({'name': 'Peter'}):\n",
235 " print(p)"
236 ],
237 "language": "python",
238 "metadata": {},
239 "outputs": [
240 {
241 "output_type": "stream",
242 "stream": "stdout",
243 "text": [
244 "{'_id': ObjectId('53a9a569dbc241153afa3c42'), 'birthyear': 1951, 'name': 'Peter'}\n",
245 "{'_id': ObjectId('53a9a569dbc241153afa3c49'), 'birthyear': 1958, 'name': 'Peter'}\n"
246 ]
247 }
248 ],
249 "prompt_number": 9
250 },
251 {
252 "cell_type": "markdown",
253 "metadata": {},
254 "source": [
255 "The cursor can also tell us how many documents will match the query."
256 ]
257 },
258 {
259 "cell_type": "code",
260 "collapsed": false,
261 "input": [
262 "tc.find({'name': 'Peter'}).count()"
263 ],
264 "language": "python",
265 "metadata": {},
266 "outputs": [
267 {
268 "metadata": {},
269 "output_type": "pyout",
270 "prompt_number": 10,
271 "text": [
272 "2"
273 ]
274 }
275 ],
276 "prompt_number": 10
277 },
278 {
279 "cell_type": "markdown",
280 "metadata": {},
281 "source": [
282 "An optional second argument to `find()` specifies which keys to return. If you give a list of keys, `find()` will return just those keys plus `_id`."
283 ]
284 },
285 {
286 "cell_type": "markdown",
287 "metadata": {},
288 "source": [
289 "# Update\n",
290 "Updating is a bit more complicated. The basic `update()` takes two arguments: a specification of the document to update (in the same way as for `find()`) and the document it is updated to.\n",
291 "\n",
292 "There are a couple of complications to this, however. One is that, by default, the new document completely replaces the old one. Let's say we want to add a surname to one of the records:"
293 ]
294 },
295 {
296 "cell_type": "code",
297 "collapsed": false,
298 "input": [
299 "patrick = tc.find_one({'name': 'Patrick'})\n",
300 "print(patrick)"
301 ],
302 "language": "python",
303 "metadata": {},
304 "outputs": [
305 {
306 "output_type": "stream",
307 "stream": "stdout",
308 "text": [
309 "{'_id': ObjectId('53a9a569dbc241153afa3c3f'), 'birthyear': 1920, 'name': 'Patrick'}\n"
310 ]
311 }
312 ],
313 "prompt_number": 11
314 },
315 {
316 "cell_type": "code",
317 "collapsed": false,
318 "input": [
319 "tc.update({'name': 'Patrick'}, {'surname': 'Troughton'})"
320 ],
321 "language": "python",
322 "metadata": {},
323 "outputs": [
324 {
325 "metadata": {},
326 "output_type": "pyout",
327 "prompt_number": 12,
328 "text": [
329 "{'connectionId': 3,\n",
330 " 'err': None,\n",
331 " 'n': 1,\n",
332 " 'ok': 1.0,\n",
333 " 'syncMillis': 0,\n",
334 " 'updatedExisting': True,\n",
335 " 'writtenTo': None}"
336 ]
337 }
338 ],
339 "prompt_number": 12
340 },
341 {
342 "cell_type": "markdown",
343 "metadata": {},
344 "source": [
345 "(One document updated, no errors.)\n",
346 "\n",
347 "If we now look for the updated document, we get a surprise:"
348 ]
349 },
350 {
351 "cell_type": "code",
352 "collapsed": false,
353 "input": [
354 "for p in tc.find({'name': 'Patrick'}):\n",
355 " print(p)"
356 ],
357 "language": "python",
358 "metadata": {},
359 "outputs": [],
360 "prompt_number": 13
361 },
362 {
363 "cell_type": "markdown",
364 "metadata": {},
365 "source": [
366 "Nothing found! If we look for the document by ID:"
367 ]
368 },
369 {
370 "cell_type": "code",
371 "collapsed": false,
372 "input": [
373 "tc.find_one({'_id': patrick['_id']})"
374 ],
375 "language": "python",
376 "metadata": {},
377 "outputs": [
378 {
379 "metadata": {},
380 "output_type": "pyout",
381 "prompt_number": 14,
382 "text": [
383 "{'_id': ObjectId('53a9a569dbc241153afa3c3f'), 'surname': 'Troughton'}"
384 ]
385 }
386 ],
387 "prompt_number": 14
388 },
389 {
390 "cell_type": "markdown",
391 "metadata": {},
392 "source": [
393 "...everything's gone, replaced by the 'surname'. If we want to add some more keys (or change existing ones), the update specification should contain some *update modifiers*, of which the most common are '\\$set' and '\\$unset'. These respectively say to set new values for keys, or discard the keys (and their associated values).\n",
394 "\n",
395 "To put back the name and birthyear, we'd use this update:"
396 ]
397 },
398 {
399 "cell_type": "code",
400 "collapsed": false,
401 "input": [
402 "tc.update({'surname': 'Troughton'}, # specify the document to update\n",
403 " {'$set': # specify the keys to update\n",
404 " {'name': 'Patrick', 'birthyear': 1920} })"
405 ],
406 "language": "python",
407 "metadata": {},
408 "outputs": [
409 {
410 "metadata": {},
411 "output_type": "pyout",
412 "prompt_number": 15,
413 "text": [
414 "{'connectionId': 3,\n",
415 " 'err': None,\n",
416 " 'n': 1,\n",
417 " 'ok': 1.0,\n",
418 " 'syncMillis': 0,\n",
419 " 'updatedExisting': True,\n",
420 " 'writtenTo': None}"
421 ]
422 }
423 ],
424 "prompt_number": 15
425 },
426 {
427 "cell_type": "code",
428 "collapsed": false,
429 "input": [
430 "for p in tc.find({'name': 'Patrick'}):\n",
431 " print(p)"
432 ],
433 "language": "python",
434 "metadata": {},
435 "outputs": [
436 {
437 "output_type": "stream",
438 "stream": "stdout",
439 "text": [
440 "{'_id': ObjectId('53a9a569dbc241153afa3c3f'), 'surname': 'Troughton', 'birthyear': 1920, 'name': 'Patrick'}\n"
441 ]
442 }
443 ],
444 "prompt_number": 16
445 },
446 {
447 "cell_type": "markdown",
448 "metadata": {},
449 "source": [
450 "The other complication is that if multiple documents match the query document, Mongo will update an arbitrary one of them. Generally, this isn't useful. To update every document that matches the query, use the 'multi' keyword parameter:"
451 ]
452 },
453 {
454 "cell_type": "code",
455 "collapsed": false,
456 "input": [
457 "tc.update({'name': 'Peter'}, {'$set': {'multi_updated': True}}, multi=True)"
458 ],
459 "language": "python",
460 "metadata": {},
461 "outputs": [
462 {
463 "metadata": {},
464 "output_type": "pyout",
465 "prompt_number": 17,
466 "text": [
467 "{'connectionId': 3,\n",
468 " 'err': None,\n",
469 " 'n': 2,\n",
470 " 'ok': 1.0,\n",
471 " 'syncMillis': 0,\n",
472 " 'updatedExisting': True,\n",
473 " 'writtenTo': None}"
474 ]
475 }
476 ],
477 "prompt_number": 17
478 },
479 {
480 "cell_type": "code",
481 "collapsed": false,
482 "input": [
483 "for p in tc.find():\n",
484 " print(p)"
485 ],
486 "language": "python",
487 "metadata": {},
488 "outputs": [
489 {
490 "output_type": "stream",
491 "stream": "stdout",
492 "text": [
493 "{'_id': ObjectId('53a9a25cdbc24112a8ed33a8'), 'birthyear': 1920, 'name': {'surname': 'Troughton', 'forename': 'Patrick'}}\n",
494 "{'_id': ObjectId('53a9a25cdbc24112a8ed33a9'), 'birthyear': 1919, 'name': {'surname': 'Pertwee', 'forename': 'Jon'}}\n",
495 "{'_id': ObjectId('53a9a25cdbc24112a8ed33aa'), 'birthyear': 1934, 'name': {'surname': 'Baker', 'forename': 'Tom'}}\n",
496 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ab'), 'birthyear': 1951, 'name': {'surname': 'Davison', 'forename': 'Peter'}}\n",
497 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ac'), 'birthyear': 1943, 'name': {'surname': 'Baker', 'forename': 'Colin'}}\n",
498 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ad'), 'birthyear': 1943, 'name': {'surname': 'McCoy', 'forename': 'Sylvester'}}\n",
499 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ae'), 'birthyear': 1959, 'name': {'surname': 'McGann', 'forename': 'Paul'}}\n",
500 "{'_id': ObjectId('53a9a25cdbc24112a8ed33af'), 'birthyear': 1964, 'name': {'surname': 'Eccleston', 'forename': 'Christopher'}}\n",
501 "{'_id': ObjectId('53a9a25cdbc24112a8ed33b0'), 'birthyear': 1971, 'name': {'surname': 'Tennant', 'forename': 'David'}}\n",
502 "{'_id': ObjectId('53a9a25cdbc24112a8ed33b1'), 'birthyear': 1982, 'name': {'surname': 'Smith', 'forename': 'Matt'}}\n",
503 "{'_id': ObjectId('53a9a25cdbc24112a8ed33b2'), 'birthyear': 1958, 'name': {'surname': 'Capaldi', 'forename': 'Peter'}}\n",
504 "{'_id': ObjectId('53a9a25cdbc24112a8ed33a7'), 'episodes': ['An Unearthly Child', 'The Daleks', 'The Tenth Planet'], 'birthyear': 1908, 'name': {'surname': 'Hartnell', 'forename': 'William'}}\n",
505 "{'_id': ObjectId('53a9a569dbc241153afa3c3e'), 'birthyear': 1908, 'name': 'William'}\n",
506 "{'_id': ObjectId('53a9a569dbc241153afa3c3f'), 'surname': 'Troughton', 'birthyear': 1920, 'name': 'Patrick'}\n",
507 "{'_id': ObjectId('53a9a569dbc241153afa3c40'), 'birthyear': 1919, 'name': 'Jon'}\n",
508 "{'_id': ObjectId('53a9a569dbc241153afa3c41'), 'birthyear': 1934, 'name': 'Tom'}\n",
509 "{'_id': ObjectId('53a9a569dbc241153afa3c42'), 'multi_updated': True, 'birthyear': 1951, 'name': 'Peter'}\n",
510 "{'_id': ObjectId('53a9a569dbc241153afa3c43'), 'birthyear': 1943, 'name': 'Colin'}\n",
511 "{'_id': ObjectId('53a9a569dbc241153afa3c44'), 'birthyear': 1943, 'name': 'Sylvester'}\n",
512 "{'_id': ObjectId('53a9a569dbc241153afa3c45'), 'birthyear': 1959, 'name': 'Paul'}\n",
513 "{'_id': ObjectId('53a9a569dbc241153afa3c46'), 'birthyear': 1964, 'name': 'Christopher'}\n",
514 "{'_id': ObjectId('53a9a569dbc241153afa3c47'), 'birthyear': 1971, 'name': 'David'}\n",
515 "{'_id': ObjectId('53a9a569dbc241153afa3c48'), 'birthyear': 1982, 'name': 'Matt'}\n",
516 "{'_id': ObjectId('53a9a569dbc241153afa3c49'), 'multi_updated': True, 'birthyear': 1958, 'name': 'Peter'}\n"
517 ]
518 }
519 ],
520 "prompt_number": 18
521 },
522 {
523 "cell_type": "markdown",
524 "metadata": {},
525 "source": [
526 "You can see that the two Peters were updated. \n",
527 "\n",
528 "We can remove the additional key with the '\\$unset' modifier (the value we're updating it to is ignored):"
529 ]
530 },
531 {
532 "cell_type": "code",
533 "collapsed": false,
534 "input": [
535 "tc.update({'name': 'Peter'}, {'$unset': {'multi_updated': ''}}, multi=True)\n",
536 "for p in tc.find():\n",
537 " print(p)"
538 ],
539 "language": "python",
540 "metadata": {},
541 "outputs": [
542 {
543 "output_type": "stream",
544 "stream": "stdout",
545 "text": [
546 "{'_id': ObjectId('53a9a25cdbc24112a8ed33a8'), 'birthyear': 1920, 'name': {'surname': 'Troughton', 'forename': 'Patrick'}}\n",
547 "{'_id': ObjectId('53a9a25cdbc24112a8ed33a9'), 'birthyear': 1919, 'name': {'surname': 'Pertwee', 'forename': 'Jon'}}\n",
548 "{'_id': ObjectId('53a9a25cdbc24112a8ed33aa'), 'birthyear': 1934, 'name': {'surname': 'Baker', 'forename': 'Tom'}}\n",
549 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ab'), 'birthyear': 1951, 'name': {'surname': 'Davison', 'forename': 'Peter'}}\n",
550 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ac'), 'birthyear': 1943, 'name': {'surname': 'Baker', 'forename': 'Colin'}}\n",
551 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ad'), 'birthyear': 1943, 'name': {'surname': 'McCoy', 'forename': 'Sylvester'}}\n",
552 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ae'), 'birthyear': 1959, 'name': {'surname': 'McGann', 'forename': 'Paul'}}\n",
553 "{'_id': ObjectId('53a9a25cdbc24112a8ed33af'), 'birthyear': 1964, 'name': {'surname': 'Eccleston', 'forename': 'Christopher'}}\n",
554 "{'_id': ObjectId('53a9a25cdbc24112a8ed33b0'), 'birthyear': 1971, 'name': {'surname': 'Tennant', 'forename': 'David'}}\n",
555 "{'_id': ObjectId('53a9a25cdbc24112a8ed33b1'), 'birthyear': 1982, 'name': {'surname': 'Smith', 'forename': 'Matt'}}\n",
556 "{'_id': ObjectId('53a9a25cdbc24112a8ed33b2'), 'birthyear': 1958, 'name': {'surname': 'Capaldi', 'forename': 'Peter'}}\n",
557 "{'_id': ObjectId('53a9a25cdbc24112a8ed33a7'), 'episodes': ['An Unearthly Child', 'The Daleks', 'The Tenth Planet'], 'birthyear': 1908, 'name': {'surname': 'Hartnell', 'forename': 'William'}}\n",
558 "{'_id': ObjectId('53a9a569dbc241153afa3c3e'), 'birthyear': 1908, 'name': 'William'}\n",
559 "{'_id': ObjectId('53a9a569dbc241153afa3c3f'), 'surname': 'Troughton', 'birthyear': 1920, 'name': 'Patrick'}\n",
560 "{'_id': ObjectId('53a9a569dbc241153afa3c40'), 'birthyear': 1919, 'name': 'Jon'}\n",
561 "{'_id': ObjectId('53a9a569dbc241153afa3c41'), 'birthyear': 1934, 'name': 'Tom'}\n",
562 "{'_id': ObjectId('53a9a569dbc241153afa3c42'), 'birthyear': 1951, 'name': 'Peter'}\n",
563 "{'_id': ObjectId('53a9a569dbc241153afa3c43'), 'birthyear': 1943, 'name': 'Colin'}\n",
564 "{'_id': ObjectId('53a9a569dbc241153afa3c44'), 'birthyear': 1943, 'name': 'Sylvester'}\n",
565 "{'_id': ObjectId('53a9a569dbc241153afa3c45'), 'birthyear': 1959, 'name': 'Paul'}\n",
566 "{'_id': ObjectId('53a9a569dbc241153afa3c46'), 'birthyear': 1964, 'name': 'Christopher'}\n",
567 "{'_id': ObjectId('53a9a569dbc241153afa3c47'), 'birthyear': 1971, 'name': 'David'}\n",
568 "{'_id': ObjectId('53a9a569dbc241153afa3c48'), 'birthyear': 1982, 'name': 'Matt'}\n",
569 "{'_id': ObjectId('53a9a569dbc241153afa3c49'), 'birthyear': 1958, 'name': 'Peter'}\n"
570 ]
571 }
572 ],
573 "prompt_number": 19
574 },
575 {
576 "cell_type": "markdown",
577 "metadata": {},
578 "source": [
579 "The 'multi' approach can only give the same value to each matching document. If we want to give a different value to each document, we have to specify each document in turn in the update. This is efficient if we use the document's \\_id, as that's indexed:"
580 ]
581 },
582 {
583 "cell_type": "code",
584 "collapsed": false,
585 "input": [
586 "for p in tc.find():\n",
587 " tc.update({'_id': p['_id']}, {'$set': {'age': 2014 - p['birthyear']}})\n",
588 "for p in tc.find():\n",
589 " print(p)"
590 ],
591 "language": "python",
592 "metadata": {},
593 "outputs": [
594 {
595 "output_type": "stream",
596 "stream": "stdout",
597 "text": [
598 "{'_id': ObjectId('53a9a25cdbc24112a8ed33a8'), 'age': 94, 'birthyear': 1920, 'name': {'surname': 'Troughton', 'forename': 'Patrick'}}\n",
599 "{'_id': ObjectId('53a9a25cdbc24112a8ed33a9'), 'age': 95, 'birthyear': 1919, 'name': {'surname': 'Pertwee', 'forename': 'Jon'}}\n",
600 "{'_id': ObjectId('53a9a25cdbc24112a8ed33aa'), 'age': 80, 'birthyear': 1934, 'name': {'surname': 'Baker', 'forename': 'Tom'}}\n",
601 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ab'), 'age': 63, 'birthyear': 1951, 'name': {'surname': 'Davison', 'forename': 'Peter'}}\n",
602 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ac'), 'age': 71, 'birthyear': 1943, 'name': {'surname': 'Baker', 'forename': 'Colin'}}\n",
603 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ad'), 'age': 71, 'birthyear': 1943, 'name': {'surname': 'McCoy', 'forename': 'Sylvester'}}\n",
604 "{'_id': ObjectId('53a9a25cdbc24112a8ed33ae'), 'age': 55, 'birthyear': 1959, 'name': {'surname': 'McGann', 'forename': 'Paul'}}\n",
605 "{'_id': ObjectId('53a9a25cdbc24112a8ed33af'), 'age': 50, 'birthyear': 1964, 'name': {'surname': 'Eccleston', 'forename': 'Christopher'}}\n",
606 "{'_id': ObjectId('53a9a25cdbc24112a8ed33b0'), 'age': 43, 'birthyear': 1971, 'name': {'surname': 'Tennant', 'forename': 'David'}}\n",
607 "{'_id': ObjectId('53a9a25cdbc24112a8ed33b1'), 'age': 32, 'birthyear': 1982, 'name': {'surname': 'Smith', 'forename': 'Matt'}}\n",
608 "{'_id': ObjectId('53a9a25cdbc24112a8ed33b2'), 'age': 56, 'birthyear': 1958, 'name': {'surname': 'Capaldi', 'forename': 'Peter'}}\n",
609 "{'_id': ObjectId('53a9a25cdbc24112a8ed33a7'), 'episodes': ['An Unearthly Child', 'The Daleks', 'The Tenth Planet'], 'age': 106, 'birthyear': 1908, 'name': {'surname': 'Hartnell', 'forename': 'William'}}\n",
610 "{'_id': ObjectId('53a9a569dbc241153afa3c3e'), 'age': 106, 'birthyear': 1908, 'name': 'William'}\n",
611 "{'_id': ObjectId('53a9a569dbc241153afa3c3f'), 'surname': 'Troughton', 'age': 94, 'birthyear': 1920, 'name': 'Patrick'}\n",
612 "{'_id': ObjectId('53a9a569dbc241153afa3c40'), 'age': 95, 'birthyear': 1919, 'name': 'Jon'}\n",
613 "{'_id': ObjectId('53a9a569dbc241153afa3c41'), 'age': 80, 'birthyear': 1934, 'name': 'Tom'}\n",
614 "{'_id': ObjectId('53a9a569dbc241153afa3c42'), 'age': 63, 'birthyear': 1951, 'name': 'Peter'}\n",
615 "{'_id': ObjectId('53a9a569dbc241153afa3c43'), 'age': 71, 'birthyear': 1943, 'name': 'Colin'}\n",
616 "{'_id': ObjectId('53a9a569dbc241153afa3c44'), 'age': 71, 'birthyear': 1943, 'name': 'Sylvester'}\n",
617 "{'_id': ObjectId('53a9a569dbc241153afa3c45'), 'age': 55, 'birthyear': 1959, 'name': 'Paul'}\n",
618 "{'_id': ObjectId('53a9a569dbc241153afa3c46'), 'age': 50, 'birthyear': 1964, 'name': 'Christopher'}\n",
619 "{'_id': ObjectId('53a9a569dbc241153afa3c47'), 'age': 43, 'birthyear': 1971, 'name': 'David'}\n",
620 "{'_id': ObjectId('53a9a569dbc241153afa3c48'), 'age': 32, 'birthyear': 1982, 'name': 'Matt'}\n",
621 "{'_id': ObjectId('53a9a569dbc241153afa3c49'), 'age': 56, 'birthyear': 1958, 'name': 'Peter'}\n"
622 ]
623 }
624 ],
625 "prompt_number": 20
626 },
627 {
628 "cell_type": "markdown",
629 "metadata": {},
630 "source": [
631 "# Embedded documents\n",
632 "\n",
633 "Values in documents can themselves be documents. For instance, we can encapsulate each person's name in a subdocument."
634 ]
635 },
636 {
637 "cell_type": "code",
638 "collapsed": false,
639 "input": [
640 "tc.drop()\n",
641 "# Insert a few\n",
642 "for f, s, b in zip('William Patrick Jon Tom Peter Colin Sylvester Paul Christopher David Matt Peter'.split(),\n",
643 " 'Hartnell Troughton Pertwee Baker Davison Baker McCoy McGann Eccleston Tennant Smith Capaldi'.split(),\n",
644 " [1908, 1920, 1919, 1934, 1951, 1943, 1943, 1959, 1964, 1971, 1982, 1958]):\n",
645 " tc.insert({'name': {'forename': f, 'surname': s}, 'birthyear': b})\n",
646 "for p in tc.find():\n",
647 " print(p)"
648 ],
649 "language": "python",
650 "metadata": {},
651 "outputs": [
652 {
653 "output_type": "stream",
654 "stream": "stdout",
655 "text": [
656 "{'_id': ObjectId('53a9a569dbc241153afa3c4a'), 'birthyear': 1908, 'name': {'surname': 'Hartnell', 'forename': 'William'}}\n",
657 "{'_id': ObjectId('53a9a569dbc241153afa3c4b'), 'birthyear': 1920, 'name': {'surname': 'Troughton', 'forename': 'Patrick'}}\n",
658 "{'_id': ObjectId('53a9a569dbc241153afa3c4c'), 'birthyear': 1919, 'name': {'surname': 'Pertwee', 'forename': 'Jon'}}\n",
659 "{'_id': ObjectId('53a9a569dbc241153afa3c4d'), 'birthyear': 1934, 'name': {'surname': 'Baker', 'forename': 'Tom'}}\n",
660 "{'_id': ObjectId('53a9a569dbc241153afa3c4e'), 'birthyear': 1951, 'name': {'surname': 'Davison', 'forename': 'Peter'}}\n",
661 "{'_id': ObjectId('53a9a569dbc241153afa3c4f'), 'birthyear': 1943, 'name': {'surname': 'Baker', 'forename': 'Colin'}}\n",
662 "{'_id': ObjectId('53a9a569dbc241153afa3c50'), 'birthyear': 1943, 'name': {'surname': 'McCoy', 'forename': 'Sylvester'}}\n",
663 "{'_id': ObjectId('53a9a569dbc241153afa3c51'), 'birthyear': 1959, 'name': {'surname': 'McGann', 'forename': 'Paul'}}\n",
664 "{'_id': ObjectId('53a9a569dbc241153afa3c52'), 'birthyear': 1964, 'name': {'surname': 'Eccleston', 'forename': 'Christopher'}}\n",
665 "{'_id': ObjectId('53a9a569dbc241153afa3c53'), 'birthyear': 1971, 'name': {'surname': 'Tennant', 'forename': 'David'}}\n",
666 "{'_id': ObjectId('53a9a569dbc241153afa3c54'), 'birthyear': 1982, 'name': {'surname': 'Smith', 'forename': 'Matt'}}\n",
667 "{'_id': ObjectId('53a9a569dbc241153afa3c55'), 'birthyear': 1958, 'name': {'surname': 'Capaldi', 'forename': 'Peter'}}\n"
668 ]
669 }
670 ],
671 "prompt_number": 21
672 },
673 {
674 "cell_type": "markdown",
675 "metadata": {},
676 "source": [
677 "We can also include a list of notable stories for each person. Note the use of the dot notation to identify keys in a subdocument."
678 ]
679 },
680 {
681 "cell_type": "code",
682 "collapsed": false,
683 "input": [
684 "tc.update({'name.forename': 'William', 'name.surname': 'Hartnell'},\n",
685 " {'$set': {'episodes': ['An Unearthly Child', 'The Daleks', 'The Tenth Planet']}})"
686 ],
687 "language": "python",
688 "metadata": {},
689 "outputs": [
690 {
691 "metadata": {},
692 "output_type": "pyout",
693 "prompt_number": 25,
694 "text": [
695 "{'connectionId': 3,\n",
696 " 'err': None,\n",
697 " 'n': 1,\n",
698 " 'ok': 1.0,\n",
699 " 'syncMillis': 0,\n",
700 " 'updatedExisting': True,\n",
701 " 'writtenTo': None}"
702 ]
703 }
704 ],
705 "prompt_number": 25
706 },
707 {
708 "cell_type": "code",
709 "collapsed": false,
710 "input": [
711 "tc.find_one({'name.forename': 'William'})"
712 ],
713 "language": "python",
714 "metadata": {},
715 "outputs": [
716 {
717 "metadata": {},
718 "output_type": "pyout",
719 "prompt_number": 26,
720 "text": [
721 "{'_id': ObjectId('53a9a569dbc241153afa3c4a'),\n",
722 " 'birthyear': 1908,\n",
723 " 'episodes': ['An Unearthly Child', 'The Daleks', 'The Tenth Planet'],\n",
724 " 'name': {'forename': 'William', 'surname': 'Hartnell'}}"
725 ]
726 }
727 ],
728 "prompt_number": 26
729 },
730 {
731 "cell_type": "markdown",
732 "metadata": {},
733 "source": [
734 "There's lots more information on this in the *MongoDB: The Definitive Guide* book and the [MongoDB documentation](http://docs.mongodb.org/manual/reference/)."
735 ]
736 },
737 {
738 "cell_type": "markdown",
739 "metadata": {},
740 "source": [
741 "# Clean up\n",
742 "Drop the test collection."
743 ]
744 },
745 {
746 "cell_type": "code",
747 "collapsed": false,
748 "input": [
749 "# Drop the test collection\n",
750 "tc.drop()"
751 ],
752 "language": "python",
753 "metadata": {},
754 "outputs": [],
755 "prompt_number": 27
756 },
757 {
758 "cell_type": "code",
759 "collapsed": false,
760 "input": [],
761 "language": "python",
762 "metadata": {},
763 "outputs": []
764 }
765 ],
766 "metadata": {}
767 }
768 ]
769 }
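
The notebook's Update section describes two behaviours that are easy to confuse: an update document with no modifiers replaces the whole document (keeping only `_id`), while `$set` and `$unset` change individual keys. The sketch below mimics those rules on plain Python dicts so they can be tried without a running server. The helper `apply_update` is hypothetical and written only for illustration; it is not part of PyMongo, and it deliberately ignores every modifier other than `$set` and `$unset`.

```python
import copy


def apply_update(doc, update):
    """Return a new dict showing how a basic update would change one document.

    Hypothetical helper for illustration only; this is not PyMongo's API.
    """
    has_modifiers = any(key.startswith('$') for key in update)
    if not has_modifiers:
        # No update modifiers: the new document replaces everything except _id.
        new_doc = {'_id': doc['_id']}
        new_doc.update(copy.deepcopy(update))
        return new_doc

    new_doc = copy.deepcopy(doc)
    # $set adds new keys or overwrites existing ones.
    for key, value in update.get('$set', {}).items():
        new_doc[key] = value
    # $unset discards keys; the values given in the update are ignored.
    for key in update.get('$unset', {}):
        new_doc.pop(key, None)
    return new_doc


patrick = {'_id': 1, 'name': 'Patrick', 'birthyear': 1920}

# A bare document replaces the original, as in the notebook's "surprise".
replaced = apply_update(patrick, {'surname': 'Troughton'})

# $set puts the lost keys back without touching 'surname'.
restored = apply_update(replaced, {'$set': {'name': 'Patrick', 'birthyear': 1920}})

# $unset removes a key again.
trimmed = apply_update(restored, {'$unset': {'surname': ''}})

print(replaced)
print(restored)
print(trimmed)
```

The three printed dicts follow the same sequence as the Patrick example in the notebook: first only `_id` and `surname` survive, then `name` and `birthyear` return alongside `surname`, and finally `surname` is dropped.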