notebooks/zz-mongo/14.1.basic-crud.ipynb

   1 {
   2  "metadata": {
   3   "name": "",
   4   "signature": "sha256:ca7c66e6984a2f06ea667e7728d0714cbb52770e149e09ca0b4b93ce7ea12ed0"
   5  },
   6  "nbformat": 3,
   7  "nbformat_minor": 0,
   8  "worksheets": [
   9   {
  10    "cells": [
  11     {
  12      "cell_type": "markdown",
  13      "metadata": {},
  14      "source": [
  15       "# CRUD In MongoDB\n",
  16       "\n",
  17       "This notebook will take you through a few basic operations with a dummy database, just to see how the basic CRUD (Create, Read, Update, Delete) operations work.\n",
  18       "\n",
  19       "We're using the [PyMongo](http://api.mongodb.org/python/current/) module to allow Python to connect to MongoDB. The notebooks in the module will describe most of the features of PyMongo you need, but you should refer to the [API documentation](http://api.mongodb.org/python/current/api/index.html) as necessary to understand the detail and nuance of PyMongo. PyMongo is also a fairly thin wrapper on MongoDB, so you may need to refer to the [MongoDB reference](http://docs.mongodb.org/manual/reference/) for some of the details and *MongoDB: The Definitive Guide* for context and background."
  20      ]
  21     },
  22     {
  23      "cell_type": "code",
  24      "collapsed": false,
  25      "input": [
  26       "# Import the required libraries\n",
  27       "\n",
  28       "import pymongo\n",
  29       "import bson\n",
  30       "from bson.objectid import ObjectId"
  31      ],
  32      "language": "python",
  33      "metadata": {
  34       "activity": false
  35      },
  36      "outputs": [],
  37      "prompt_number": 31
  38     },
  39     {
  40      "cell_type": "code",
  41      "collapsed": false,
  42      "input": [
  43       "# Open a connection to the Mongo server\n",
  44       "client = pymongo.MongoClient('mongodb://localhost:27017/')"
  45      ],
  46      "language": "python",
  47      "metadata": {
  48       "activity": false
  49      },
  50      "outputs": [],
  51      "prompt_number": 32
  52     },
  53     {
  54      "cell_type": "code",
  55      "collapsed": false,
  56      "input": [
  57       "# Create the crud_test database and a test_collection within it.\n",
  58       "test_db = client.crud_test\n",
  59       "tc = test_db.test_collection"
  60      ],
  61      "language": "python",
  62      "metadata": {
  63       "activity": false
  64      },
  65      "outputs": [],
  66      "prompt_number": 33
  67     },
  68     {
  69      "cell_type": "markdown",
  70      "metadata": {},
  71      "source": [
  72       "Note that database and collection creation in MongoDB is *lazy*: the database and collection aren't actually created in the DBMS until the first document is written."
  73      ]
  74     },
  75     {
  76      "cell_type": "markdown",
  77      "metadata": {},
  78      "source": [
  79       "# Data structures and conversion\n",
  80       "PyMongo automatically handles most of the translation between Python data structures and the JSON structures that Mongo uses. This table summarises the main equivalences.\n",
  81       "\n",
  82       "| Document DB term | JSON structure | Python structure |\n",
  83       "|------------------|----------------|------------------|\n",
  84       "| Document or sub-document | Object | dict  |\n",
  85       "| List | Array | list |\n",
  86       "| Key | String | str |\n",
  87       "| String | String | str |\n",
  88       "| Number | Number | int or float, depending on the value |\n",
  89       "| Date | Date | datetime.datetime object |\n",
  90       "| Object IDs | BSON ObjectId | BSON ObjectId |\n",
  91       "\n",
  92       "MongoDB uses BSON, a binary version of JSON, internally. You can generally ignore this, except when you want to create new ObjectIds for documents."
  93      ]
  94     },
  95     {
  96      "cell_type": "markdown",
  97      "metadata": {},
  98      "source": [
  99       "# Create\n",
 100       "Let's insert a few simple documents, just to get started.\n",
 101       "\n",
 102       "Note that keys in a document have to be strings, but the values can be almost anything."
 103      ]
 104     },
 105     {
 106      "cell_type": "code",
 107      "collapsed": false,
 108      "input": [
 109       "# Insert a single document\n",
 110       "tc.insert({'name': 'William', 'birthyear': 1908})\n",
 111       "\n",
 112       "# Insert a few\n",
 113       "for n, b in zip('Patrick Jon Tom Peter Colin Sylvester Paul Christopher David Matt Peter'.split(),\n",
 114       "                [1920, 1919, 1934, 1951, 1943, 1943, 1959, 1964, 1971, 1982, 1958]):\n",
 115       "    tc.insert({'name': n, 'birthyear': b})"
 116      ],
 117      "language": "python",
 118      "metadata": {
 119       "activity": false
 120      },
 121      "outputs": [],
 122      "prompt_number": 34
 123     },
 124     {
 125      "cell_type": "markdown",
 126      "metadata": {},
 127      "source": [
 128       "# Read\n",
 129       "`find_one()` will return a single (arbitrary) document. Note that Mongo automatically adds an `_id` field to each document. You can override this if you really want to, but we won't bother."
 130      ]
 131     },
 132     {
 133      "cell_type": "code",
 134      "collapsed": false,
 135      "input": [
 136       "tc.find_one()"
 137      ],
 138      "language": "python",
 139      "metadata": {
 140       "activity": false
 141      },
 142      "outputs": [
 143       {
 144        "metadata": {},
 145        "output_type": "pyout",
 146        "prompt_number": 9,
 147        "text": [
 148         "{'_id': ObjectId('54dce849f203c6076e4cda67'),\n",
 149         " 'birthyear': 1908,\n",
 150         " 'name': 'William'}"
 151        ]
 152       }
 153      ],
 154      "prompt_number": 9
 155     },
 156     {
 157      "cell_type": "markdown",
 158      "metadata": {},
 159      "source": [
 160       "The `pymongo` library does the type conversion for us:"
 161      ]
 162     },
 163     {
 164      "cell_type": "code",
 165      "collapsed": false,
 166      "input": [
 167       "type(tc.find_one())"
 168      ],
 169      "language": "python",
 170      "metadata": {},
 171      "outputs": [
 172       {
 173        "metadata": {},
 174        "output_type": "pyout",
 175        "prompt_number": 38,
 176        "text": [
 177         "dict"
 178        ]
 179       }
 180      ],
 181      "prompt_number": 38
 182     },
 183     {
 184      "cell_type": "markdown",
 185      "metadata": {},
 186      "source": [
 187       "If we give a dict of some key-value pairs, `find_one()` will return a document that matches them."
 188      ]
 189     },
 190     {
 191      "cell_type": "code",
 192      "collapsed": false,
 193      "input": [
 194       "tc.find_one({'name': 'Peter'})"
 195      ],
 196      "language": "python",
 197      "metadata": {
 198       "activity": false
 199      },
 200      "outputs": [
 201       {
 202        "metadata": {},
 203        "output_type": "pyout",
 204        "prompt_number": 10,
 205        "text": [
 206         "{'_id': ObjectId('54dce84af203c6076e4cda6b'),\n",
 207         " 'birthyear': 1951,\n",
 208         " 'name': 'Peter'}"
 209        ]
 210       }
 211      ],
 212      "prompt_number": 10
 213     },
 214     {
 215      "cell_type": "code",
 216      "collapsed": false,
 217      "input": [
 218       "tc.find_one({'birthyear': 1943})"
 219      ],
 220      "language": "python",
 221      "metadata": {
 222       "activity": false
 223      },
 224      "outputs": [
 225       {
 226        "metadata": {},
 227        "output_type": "pyout",
 228        "prompt_number": 11,
 229        "text": [
 230         "{'_id': ObjectId('54dce84af203c6076e4cda6c'),\n",
 231         " 'birthyear': 1943,\n",
 232         " 'name': 'Colin'}"
 233        ]
 234       }
 235      ],
 236      "prompt_number": 11
 237     },
 238     {
 239      "cell_type": "markdown",
 240      "metadata": {},
 241      "source": [
 242       "`find()` finds all the documents that match the query and returns a cursor that can be iterated over to retrieve the documents one by one."
 243      ]
 244     },
 245     {
 246      "cell_type": "code",
 247      "collapsed": false,
 248      "input": [
 249       "tc.find({'name': 'Peter'})"
 250      ],
 251      "language": "python",
 252      "metadata": {
 253       "activity": false
 254      },
 255      "outputs": [
 256       {
 257        "metadata": {},
 258        "output_type": "pyout",
 259        "prompt_number": 12,
 260        "text": [
 261         "<pymongo.cursor.Cursor at 0xb09f042c>"
 262        ]
 263       }
 264      ],
 265      "prompt_number": 12
 266     },
 267     {
 268      "cell_type": "code",
 269      "collapsed": false,
 270      "input": [
 271       "for p in tc.find({'name': 'Peter'}):\n",
 272       "    print(p)"
 273      ],
 274      "language": "python",
 275      "metadata": {
 276       "activity": false
 277      },
 278      "outputs": [
 279       {
 280        "output_type": "stream",
 281        "stream": "stdout",
 282        "text": [
 283         "{'_id': ObjectId('54ddea37f203c6076e4cda83'), 'birthyear': 1951, 'name': 'Peter'}\n",
 284         "{'_id': ObjectId('54ddea37f203c6076e4cda8a'), 'birthyear': 1958, 'name': 'Peter'}\n"
 285        ]
 286       }
 287      ],
 288      "prompt_number": 37
 289     },
 290     {
 291      "cell_type": "code",
 292      "collapsed": false,
 293      "input": [
 294       "max(person['birthyear'] for person in tc.find())"
 295      ],
 296      "language": "python",
 297      "metadata": {},
 298      "outputs": [
 299       {
 300        "metadata": {},
 301        "output_type": "pyout",
 302        "prompt_number": 35,
 303        "text": [
 304         "1982"
 305        ]
 306       }
 307      ],
 308      "prompt_number": 35
 309     },
 310     {
 311      "cell_type": "markdown",
 312      "metadata": {},
 313      "source": [
 314       "The cursor can also tell us how many documents will match the query."
 315      ]
 316     },
 317     {
 318      "cell_type": "code",
 319      "collapsed": false,
 320      "input": [
 321       "tc.find({'name': 'Peter'}).count()"
 322      ],
 323      "language": "python",
 324      "metadata": {
 325       "activity": false
 326      },
 327      "outputs": [
 328       {
 329        "metadata": {},
 330        "output_type": "pyout",
 331        "prompt_number": 14,
 332        "text": [
 333         "2"
 334        ]
 335       }
 336      ],
 337      "prompt_number": 14
 338     },
 339     {
 340      "cell_type": "markdown",
 341      "metadata": {},
 342      "source": [
 343       "An optional second argument to `find()` specifies which key-value pairs to return. If you give a list of keys, `find()` will return just those plus the `_id`."
 344      ]
 345     },
 346     {
 347      "cell_type": "code",
 348      "collapsed": false,
 349      "input": [
 350       "for p in tc.find({'name': 'Peter'}, ['birthyear']):\n",
 351       "    print(p)"
 352      ],
 353      "language": "python",
 354      "metadata": {
 355       "activity": false
 356      },
 357      "outputs": [
 358       {
 359        "output_type": "stream",
 360        "stream": "stdout",
 361        "text": [
 362         "{'_id': ObjectId('54ddea37f203c6076e4cda83'), 'birthyear': 1951}\n",
 363         "{'_id': ObjectId('54ddea37f203c6076e4cda8a'), 'birthyear': 1958}\n"
 364        ]
 365       }
 366      ],
 367      "prompt_number": 36
 368     },
 369     {
 370      "cell_type": "markdown",
 371      "metadata": {},
 372      "source": [
 373       "<div class=\"activity\">Activity</div>\n",
 374       "How many people were born in 1943? What are their names?"
 375      ]
 376     },
 377     {
 378      "cell_type": "code",
 379      "collapsed": false,
 380      "input": [
 381       "# How many people were born in 1943?"
 382      ],
 383      "language": "python",
 384      "metadata": {},
 385      "outputs": []
 386     },
 387     {
 388      "cell_type": "code",
 389      "collapsed": false,
 390      "input": [
 391       "# What are the names of people born in 1943?"
 392      ],
 393      "language": "python",
 394      "metadata": {},
 395      "outputs": []
 396     },
 397     {
 398      "cell_type": "markdown",
 399      "metadata": {},
 400      "source": [
 401       "How many people were born in 1943?\n",
 402       "<div class=\"answer\" id=\"count1943\" style=\"display: none\">\n",
 404       "`tc.find({'birthyear': 1943}).count()`</div>\n",
 405       "\n",
 406       "What are the names of people born in 1943?\n",
 407       "<div class=\"answer\" id=\"name1943\" style=\"display: none\">\n",
 409       "`[person['name'] for person in tc.find({'birthyear': 1943})]`</div>"
 410      ]
 411     },
 412     {
 413      "cell_type": "markdown",
 414      "metadata": {},
 415      "source": [
 416       "# Update\n",
 417       "The basic `update()` takes two arguments: a specification of the document to update (in the same form as a `find()` query) and the document that replaces it.\n",
 418       "\n",
 419       "There are a couple of complications to this, however. One is that, by default, the new document completely replaces the old one. Let's say we want to add a surname to one of the records:"
 420      ]
 421     },
 422     {
 423      "cell_type": "code",
 424      "collapsed": false,
 425      "input": [
 426       "patrick = tc.find_one({'name': 'Patrick'})\n",
 427       "print(patrick)"
 428      ],
 429      "language": "python",
 430      "metadata": {
 431       "activity": false
 432      },
 433      "outputs": [
 434       {
 435        "output_type": "stream",
 436        "stream": "stdout",
 437        "text": [
 438         "{'_id': ObjectId('54dce84af203c6076e4cda68'), 'birthyear': 1920, 'name': 'Patrick'}\n"
 439        ]
 440       }
 441      ],
 442      "prompt_number": 15
 443     },
 444     {
 445      "cell_type": "code",
 446      "collapsed": false,
 447      "input": [
 448       "tc.update({'name': 'Patrick'}, {'surname': 'Troughton'})"
 449      ],
 450      "language": "python",
 451      "metadata": {
 452       "activity": false
 453      },
 454      "outputs": [
 455       {
 456        "metadata": {},
 457        "output_type": "pyout",
 458        "prompt_number": 16,
 459        "text": [
 460         "{'connectionId': 9, 'updatedExisting': True, 'n': 1, 'err': None, 'ok': 1.0}"
 461        ]
 462       }
 463      ],
 464      "prompt_number": 16
 465     },
 466     {
 467      "cell_type": "markdown",
 468      "metadata": {},
 469      "source": [
 470       "(One document updated, no errors.)\n",
 471       "\n",
 472       "If we now look for the updated document, we get a surprise:"
 473      ]
 474     },
 475     {
 476      "cell_type": "code",
 477      "collapsed": false,
 478      "input": [
 479       "for p in tc.find({'name': 'Patrick'}):\n",
 480       "    print(p)"
 481      ],
 482      "language": "python",
 483      "metadata": {
 484       "activity": false
 485      },
 486      "outputs": [],
 487      "prompt_number": 17
 488     },
 489     {
 490      "cell_type": "markdown",
 491      "metadata": {},
 492      "source": [
 493       "Nothing found! If we look for the document by ID:"
 494      ]
 495     },
 496     {
 497      "cell_type": "code",
 498      "collapsed": false,
 499      "input": [
 500       "tc.find_one({'_id': patrick['_id']})"
 501      ],
 502      "language": "python",
 503      "metadata": {
 504       "activity": false
 505      },
 506      "outputs": [
 507       {
 508        "metadata": {},
 509        "output_type": "pyout",
 510        "prompt_number": 18,
 511        "text": [
 512         "{'surname': 'Troughton', '_id': ObjectId('54dce84af203c6076e4cda68')}"
 513        ]
 514       }
 515      ],
 516      "prompt_number": 18
 517     },
 518     {
 519      "cell_type": "markdown",
 520      "metadata": {},
 521      "source": [
 522       "...everything's gone, replaced by the 'surname'. If we want to add some more keys (or change existing ones), the update specification should contain some *update modifiers*, of which the most common are '\\$set' and '\\$unset'. These respectively say to set new values for keys, or discard the keys (and their associated values).\n",
 523       "\n",
 524       "To put back the name and birthyear, we'd use this update:"
 525      ]
 526     },
 527     {
 528      "cell_type": "code",
 529      "collapsed": false,
 530      "input": [
 531       "tc.update({'surname': 'Troughton'}, # specify the document to update\n",
 532       "          {'$set':                  # specify the keys to update\n",
 533       "           {'name': 'Patrick', 'birthyear': 1920} })"
 534      ],
 535      "language": "python",
 536      "metadata": {
 537       "activity": false
 538      },
 539      "outputs": [
 540       {
 541        "metadata": {},
 542        "output_type": "pyout",
 543        "prompt_number": 19,
 544        "text": [
 545         "{'connectionId': 9, 'updatedExisting': True, 'n': 1, 'err': None, 'ok': 1.0}"
 546        ]
 547       }
 548      ],
 549      "prompt_number": 19
 550     },
 551     {
 552      "cell_type": "code",
 553      "collapsed": false,
 554      "input": [
 555       "for p in tc.find({'name': 'Patrick'}):\n",
 556       "    print(p)"
 557      ],
 558      "language": "python",
 559      "metadata": {
 560       "activity": false
 561      },
 562      "outputs": [
 563       {
 564        "output_type": "stream",
 565        "stream": "stdout",
 566        "text": [
 567         "{'surname': 'Troughton', '_id': ObjectId('54dce84af203c6076e4cda68'), 'birthyear': 1920, 'name': 'Patrick'}\n"
 568        ]
 569       }
 570      ],
 571      "prompt_number": 20
 572     },
 573     {
 574      "cell_type": "markdown",
 575      "metadata": {},
 576      "source": [
 577       "The other complication is that if multiple documents match the query document, Mongo will update an arbitrary one of them. Generally, this isn't useful. To update every document that matches the query, use the 'multi' keyword parameter:"
 578      ]
 579     },
 580     {
 581      "cell_type": "code",
 582      "collapsed": false,
 583      "input": [
 584       "tc.update({'name': 'Peter'}, {'$set': {'multi_updated': True}}, multi=True)"
 585      ],
 586      "language": "python",
 587      "metadata": {
 588       "activity": false
 589      },
 590      "outputs": [
 591       {
 592        "metadata": {},
 593        "output_type": "pyout",
 594        "prompt_number": 21,
 595        "text": [
 596         "{'connectionId': 9, 'updatedExisting': True, 'n': 2, 'err': None, 'ok': 1.0}"
 597        ]
 598       }
 599      ],
 600      "prompt_number": 21
 601     },
 602     {
 603      "cell_type": "code",
 604      "collapsed": false,
 605      "input": [
 606       "for p in tc.find():\n",
 607       "    print(p)"
 608      ],
 609      "language": "python",
 610      "metadata": {
 611       "activity": false
 612      },
 613      "outputs": [
 614       {
 615        "output_type": "stream",
 616        "stream": "stdout",
 617        "text": [
 618         "{'_id': ObjectId('54dce849f203c6076e4cda67'), 'birthyear': 1908, 'name': 'William'}\n",
 619         "{'_id': ObjectId('54dce84af203c6076e4cda69'), 'birthyear': 1919, 'name': 'Jon'}\n",
 620         "{'_id': ObjectId('54dce84af203c6076e4cda6a'), 'birthyear': 1934, 'name': 'Tom'}\n",
 621         "{'_id': ObjectId('54dce84af203c6076e4cda6c'), 'birthyear': 1943, 'name': 'Colin'}\n",
 622         "{'_id': ObjectId('54dce84af203c6076e4cda6d'), 'birthyear': 1943, 'name': 'Sylvester'}\n",
 623         "{'_id': ObjectId('54dce84af203c6076e4cda6e'), 'birthyear': 1959, 'name': 'Paul'}\n",
 624         "{'_id': ObjectId('54dce84af203c6076e4cda6f'), 'birthyear': 1964, 'name': 'Christopher'}\n",
 625         "{'_id': ObjectId('54dce84af203c6076e4cda70'), 'birthyear': 1971, 'name': 'David'}\n",
 626         "{'_id': ObjectId('54dce84af203c6076e4cda71'), 'birthyear': 1982, 'name': 'Matt'}\n",
 627         "{'surname': 'Troughton', '_id': ObjectId('54dce84af203c6076e4cda68'), 'birthyear': 1920, 'name': 'Patrick'}\n",
 628         "{'name': 'Peter', '_id': ObjectId('54dce84af203c6076e4cda6b'), 'birthyear': 1951, 'multi_updated': True}\n",
 629         "{'name': 'Peter', '_id': ObjectId('54dce84af203c6076e4cda72'), 'birthyear': 1958, 'multi_updated': True}\n"
 630        ]
 631       }
 632      ],
 633      "prompt_number": 22
 634     },
 635     {
 636      "cell_type": "markdown",
 637      "metadata": {},
 638      "source": [
 639       "You can see that the two Peters were updated. \n",
 640       "\n",
 641       "We can remove the additional key with the '\\$unset' modifier (the value we're updating it to is ignored):"
 642      ]
 643     },
 644     {
 645      "cell_type": "code",
 646      "collapsed": false,
 647      "input": [
 648       "tc.update({'name': 'Peter'}, {'$unset': {'multi_updated': ''}}, multi=True)\n",
 649       "for p in tc.find():\n",
 650       "    print(p)"
 651      ],
 652      "language": "python",
 653      "metadata": {
 654       "activity": false
 655      },
 656      "outputs": [
 657       {
 658        "output_type": "stream",
 659        "stream": "stdout",
 660        "text": [
 661         "{'_id': ObjectId('54dce849f203c6076e4cda67'), 'birthyear': 1908, 'name': 'William'}\n",
 662         "{'_id': ObjectId('54dce84af203c6076e4cda69'), 'birthyear': 1919, 'name': 'Jon'}\n",
 663         "{'_id': ObjectId('54dce84af203c6076e4cda6a'), 'birthyear': 1934, 'name': 'Tom'}\n",
 664         "{'_id': ObjectId('54dce84af203c6076e4cda6c'), 'birthyear': 1943, 'name': 'Colin'}\n",
 665         "{'_id': ObjectId('54dce84af203c6076e4cda6d'), 'birthyear': 1943, 'name': 'Sylvester'}\n",
 666         "{'_id': ObjectId('54dce84af203c6076e4cda6e'), 'birthyear': 1959, 'name': 'Paul'}\n",
 667         "{'_id': ObjectId('54dce84af203c6076e4cda6f'), 'birthyear': 1964, 'name': 'Christopher'}\n",
 668         "{'_id': ObjectId('54dce84af203c6076e4cda70'), 'birthyear': 1971, 'name': 'David'}\n",
 669         "{'_id': ObjectId('54dce84af203c6076e4cda71'), 'birthyear': 1982, 'name': 'Matt'}\n",
 670         "{'surname': 'Troughton', '_id': ObjectId('54dce84af203c6076e4cda68'), 'birthyear': 1920, 'name': 'Patrick'}\n",
 671         "{'_id': ObjectId('54dce84af203c6076e4cda6b'), 'birthyear': 1951, 'name': 'Peter'}\n",
 672         "{'_id': ObjectId('54dce84af203c6076e4cda72'), 'birthyear': 1958, 'name': 'Peter'}\n"
 673        ]
 674       }
 675      ],
 676      "prompt_number": 23
 677     },
 678     {
 679      "cell_type": "markdown",
 680      "metadata": {},
 681      "source": [
 682       "The 'multi' approach can only give the same value to each matching document. If we want to give a different value to each document, we have to specify each document in turn in the update. This is efficient if we use the document's \\_id, as that's indexed:"
 683      ]
 684     },
 685     {
 686      "cell_type": "code",
 687      "collapsed": false,
 688      "input": [
 689       "for p in tc.find():\n",
 690       "    tc.update({'_id': p['_id']}, {'$set': {'age': 2015 - p['birthyear']}})\n",
 691       "for p in tc.find():\n",
 692       "    print(p)"
 693      ],
 694      "language": "python",
 695      "metadata": {
 696       "activity": false
 697      },
 698      "outputs": [
 699       {
 700        "output_type": "stream",
 701        "stream": "stdout",
 702        "text": [
 703         "{'age': 64, '_id': ObjectId('54dce84af203c6076e4cda6b'), 'birthyear': 1951, 'name': 'Peter'}\n",
 704         "{'age': 57, '_id': ObjectId('54dce84af203c6076e4cda72'), 'birthyear': 1958, 'name': 'Peter'}\n",
 705         "{'age': 107, '_id': ObjectId('54dce849f203c6076e4cda67'), 'birthyear': 1908, 'name': 'William'}\n",
 706         "{'age': 96, '_id': ObjectId('54dce84af203c6076e4cda69'), 'birthyear': 1919, 'name': 'Jon'}\n",
 707         "{'age': 81, '_id': ObjectId('54dce84af203c6076e4cda6a'), 'birthyear': 1934, 'name': 'Tom'}\n",
 708         "{'age': 72, '_id': ObjectId('54dce84af203c6076e4cda6c'), 'birthyear': 1943, 'name': 'Colin'}\n",
 709         "{'age': 72, '_id': ObjectId('54dce84af203c6076e4cda6d'), 'birthyear': 1943, 'name': 'Sylvester'}\n",
 710         "{'age': 56, '_id': ObjectId('54dce84af203c6076e4cda6e'), 'birthyear': 1959, 'name': 'Paul'}\n",
 711         "{'age': 51, '_id': ObjectId('54dce84af203c6076e4cda6f'), 'birthyear': 1964, 'name': 'Christopher'}\n",
 712         "{'age': 44, '_id': ObjectId('54dce84af203c6076e4cda70'), 'birthyear': 1971, 'name': 'David'}\n",
 713         "{'age': 33, '_id': ObjectId('54dce84af203c6076e4cda71'), 'birthyear': 1982, 'name': 'Matt'}\n",
 714         "{'surname': 'Troughton', 'age': 95, '_id': ObjectId('54dce84af203c6076e4cda68'), 'birthyear': 1920, 'name': 'Patrick'}\n"
 715        ]
 716       }
 717      ],
 718      "prompt_number": 25
 719     },
 720     {
 721      "cell_type": "markdown",
 722      "metadata": {},
 723      "source": [
 724       "<div class=\"activity\">Activity</div>\n",
 725       "Classify the people into two groups. Those born in 1945 or earlier should be labelled as `'age': 'old'`, while the others should be labelled as `'age': 'young'`."
 726      ]
 727     },
 728     {
 729      "cell_type": "markdown",
 730      "metadata": {},
 731      "source": [
 732       "Update people born in 1945 or earlier\n",
 733       "<div class=\"answer\" id=\"updateOld\" style=\"display: none\">\n",
 734       "`tc.update({'birthyear': {'$lte': 1945}}, {'$set': {'age': 'old'}}, multi=True)`</div>\n",
 735       "\n",
 736       "Update people born after 1945\n",
 737       "<div class=\"answer\" id=\"updateYoung\" style=\"display: none\">\n",
 738       "`tc.update({'birthyear': {'$gt': 1945}}, {'$set': {'age': 'young'}}, multi=True)`</div>"
 739      ]
 740     },
 741     {
 742      "cell_type": "markdown",
 743      "metadata": {},
 744      "source": [
 745       "# Embedded documents\n",
 746       "\n",
 747       "Values in documents can themselves be documents. For instance, we can encapsulate each person's name in a subdocument."
 748      ]
 749     },
 750     {
 751      "cell_type": "code",
 752      "collapsed": false,
 753      "input": [
 754       "tc.drop()\n",
 755       "# Insert a few\n",
 756       "for f, s, b in zip('William Patrick Jon Tom Peter Colin Sylvester Paul Christopher David Matt Peter'.split(),\n",
 757       "                   'Hartnell Troughton Pertwee Baker Davison Baker McCoy McGann Eccleston Tennant Smith Capaldi'.split(),\n",
 758       "                [1908, 1920, 1919, 1934, 1951, 1943, 1943, 1959, 1964, 1971, 1982, 1958]):\n",
 759       "    tc.insert({'name': {'forename': f, 'surname': s}, 'birthyear': b})\n",
 760       "for p in tc.find():\n",
 761       "    print(p)"
 762      ],
 763      "language": "python",
 764      "metadata": {
 765       "activity": false
 766      },
 767      "outputs": [
 768       {
 769        "output_type": "stream",
 770        "stream": "stdout",
 771        "text": [
 772         "{'_id': ObjectId('54dce8aef203c6076e4cda73'), 'birthyear': 1908, 'name': {'surname': 'Hartnell', 'forename': 'William'}}\n",
 773         "{'_id': ObjectId('54dce8aef203c6076e4cda74'), 'birthyear': 1920, 'name': {'surname': 'Troughton', 'forename': 'Patrick'}}\n",
 774         "{'_id': ObjectId('54dce8aef203c6076e4cda75'), 'birthyear': 1919, 'name': {'surname': 'Pertwee', 'forename': 'Jon'}}\n",
 775         "{'_id': ObjectId('54dce8aef203c6076e4cda76'), 'birthyear': 1934, 'name': {'surname': 'Baker', 'forename': 'Tom'}}\n",
 776         "{'_id': ObjectId('54dce8aef203c6076e4cda77'), 'birthyear': 1951, 'name': {'surname': 'Davison', 'forename': 'Peter'}}\n",
 777         "{'_id': ObjectId('54dce8aef203c6076e4cda78'), 'birthyear': 1943, 'name': {'surname': 'Baker', 'forename': 'Colin'}}\n",
 778         "{'_id': ObjectId('54dce8aef203c6076e4cda79'), 'birthyear': 1943, 'name': {'surname': 'McCoy', 'forename': 'Sylvester'}}\n",
 779         "{'_id': ObjectId('54dce8aef203c6076e4cda7a'), 'birthyear': 1959, 'name': {'surname': 'McGann', 'forename': 'Paul'}}\n",
 780         "{'_id': ObjectId('54dce8aef203c6076e4cda7b'), 'birthyear': 1964, 'name': {'surname': 'Eccleston', 'forename': 'Christopher'}}\n",
 781         "{'_id': ObjectId('54dce8aef203c6076e4cda7c'), 'birthyear': 1971, 'name': {'surname': 'Tennant', 'forename': 'David'}}\n",
 782         "{'_id': ObjectId('54dce8aef203c6076e4cda7d'), 'birthyear': 1982, 'name': {'surname': 'Smith', 'forename': 'Matt'}}\n",
 783         "{'_id': ObjectId('54dce8aef203c6076e4cda7e'), 'birthyear': 1958, 'name': {'surname': 'Capaldi', 'forename': 'Peter'}}\n"
 784        ]
 785       }
 786      ],
 787      "prompt_number": 26
 788     },
 789     {
 790      "cell_type": "markdown",
 791      "metadata": {},
 792      "source": [
 793       "We can also include a list of notable stories for each person. Note the use of the dot notation to identify keys in a subdocument."
 794      ]
 795     },
 796     {
 797      "cell_type": "code",
 798      "collapsed": false,
 799      "input": [
 800       "tc.update({'name.forename': 'William', 'name.surname': 'Hartnell'},\n",
 801       "        {'$set': {'episodes': ['An Unearthly Child', 'The Daleks', 'The Tenth Planet']}})"
 802      ],
 803      "language": "python",
 804      "metadata": {
 805       "activity": false
 806      },
 807      "outputs": [
 808       {
 809        "metadata": {},
 810        "output_type": "pyout",
 811        "prompt_number": 27,
 812        "text": [
 813         "{'connectionId': 9, 'updatedExisting': True, 'n': 1, 'err': None, 'ok': 1.0}"
 814        ]
 815       }
 816      ],
 817      "prompt_number": 27
 818     },
 819     {
 820      "cell_type": "code",
 821      "collapsed": false,
 822      "input": [
 823       "tc.find_one({'name.forename': 'William'})"
 824      ],
 825      "language": "python",
 826      "metadata": {
 827       "activity": false
 828      },
 829      "outputs": [
 830       {
 831        "metadata": {},
 832        "output_type": "pyout",
 833        "prompt_number": 28,
 834        "text": [
 835         "{'name': {'surname': 'Hartnell', 'forename': 'William'},\n",
 836         " '_id': ObjectId('54dce8aef203c6076e4cda73'),\n",
 837         " 'birthyear': 1908,\n",
 838         " 'episodes': ['An Unearthly Child', 'The Daleks', 'The Tenth Planet']}"
 839        ]
 840       }
 841      ],
 842      "prompt_number": 28
 843     },
 844     {
 845      "cell_type": "markdown",
 846      "metadata": {},
 847      "source": [
 848       "There's lots more information on this in the *MongoDB: The Definitive Guide* book and the [MongoDB documentation](http://docs.mongodb.org/manual/reference/)."
 849      ]
 850     },
 851     {
 852      "cell_type": "markdown",
 853      "metadata": {},
 854      "source": [
 855       "# Clean up\n",
 856       "Drop the test collection."
 857      ]
 858     },
 859     {
 860      "cell_type": "code",
 861      "collapsed": false,
 862      "input": [
 863       "# Drop the test collection\n",
 864       "tc.drop()"
 865      ],
 866      "language": "python",
 867      "metadata": {
 868       "activity": false
 869      },
 870      "outputs": [],
 871      "prompt_number": 29
 872     },
 873     {
 874      "cell_type": "code",
 875      "collapsed": false,
 876      "input": [],
 877      "language": "python",
 878      "metadata": {
 879       "activity": false
 880      },
 881      "outputs": []
 882     }
 883    ],
 884    "metadata": {}
 885   }
 886  ]
 887 }