Removing files from data analysis directory

[ou-summer-of-code-2017.git] / 03-door-codes / door-codes-solution.ipynb
diff --git a/03-door-codes/door-codes-solution.ipynb b/03-door-codes/door-codes-solution.ipynb

index 8e67f00ca1da95cb16cd7812e9d3a0996dd70b82..f0ea3499828b6e4106fbb1d54ac20a90184adaac 100644
--- a/03-door-codes/door-codes-solution.ipynb
+++ b/03-door-codes/door-codes-solution.ipynb
@@ -43,6 +43,26 @@
      "**What is your door code?**"
     ]
    },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Worked example of solution: Part 1\n",
+    "\n",
+    "While the overall shape of this is the same as previous days (walk along a list, updating the code as you reach each letter), there are a couple of wrinkles:\n",
+    "\n",
+    "1. Not every character in the input should be processed: non-letters are discarded, and the letters that remain are converted to lower case.\n",
+    "2. The 'update the code' part is complex.\n",
+    "\n",
+    "\"Sanitising\" the input is, again, a matter of walking over the input, converting letters and discarding the rest. These are examples of standard approaches: `filter` applies a predicate to every item in a sequence, returning just those that pass; `map` applies a function to every item in a sequence, returning the sequence of results. In this case, sanitising the input is `filter`ing to keep just the letters, then `map`ping the \"convert to lowercase\" function over them. Python's comprehensions do both at once: the general form is `f(x) for x in sequence if predicate(x)`.\n",
+    "\n",
+    "Updating the code involves lots of faffing around, converting between characters and numbers. Rather than retyping lots of arithmetic, I define a couple of functions to do the conversions how I want. I've deliberately given them short names, as I want the functions to almost disappear in the program, becoming little more than punctuation. That will keep the focus on the important part, the updating.\n",
+    "\n",
+    "The `ord(letter) - ord('a')` and `chr(number + ord('a'))` expressions are standard idioms for converting between letters and their positions in the alphabet. There's also a shift of the result by 1 to give one-based numbering, and the modulus operation `%` to keep the numbers in the range 0-25 before converting back to letters.\n",
+    "\n",
+    "Finally, the `string` library defines some convenient constants, which helps prevent the annoying and hard-to-find typos that could creep in if I wrote out the alphabet verbatim here."
+   ]
+  },
    {
     "cell_type": "code",
     "execution_count": 2,
@@ -54,6 +74,38 @@
      "import string"
     ]
    },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def sanitise(phrase):\n",
+    "    return ''.join(l.lower() for l in phrase if l in string.ascii_letters)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'helloworld'"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "sanitise('Hello World')"
+   ]
+  },
    {
     "cell_type": "code",
     "execution_count": 3,
@@ -91,22 +143,29 @@
    },
    {
     "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 23,
     "metadata": {
      "collapsed": true
     },
     "outputs": [],
     "source": [
-    "def sanitise(phrase):\n",
-    "    return ''.join(l for l in phrase if l in string.ascii_lowercase)"
+    "def whash1(word):\n",
+    "    h = list(word[:2])\n",
+    "    for l in word[2:]:\n",
+    "        h[0] = c(o(h[0]) + o(h[1]))\n",
+    "        h[1] = c(o(h[1]) + o(l))\n",
+    "    return ''.join(h)"
     ]
    },
    {
     "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {},
+   "execution_count": 20,
+   "metadata": {
+    "collapsed": true
+   },
     "outputs": [],
     "source": [
+    "# Extended version that generates the tables used in the question text.\n",
      "def whash1(word, show_steps=False):\n",
      "    if show_steps:\n",
      "        print('| old code | code as<br>numbers | passphrase<br/>letter | number of<br/>letter | new first<br/>part of code |'\n",
@@ -132,7 +191,7 @@
    },
    {
     "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 9,
     "metadata": {},
     "outputs": [
      {
@@ -153,7 +212,7 @@
         "'vk'"
        ]
       },
-     "execution_count": 8,
+     "execution_count": 9,
       "metadata": {},
       "output_type": "execute_result"
      }
@@ -164,7 +223,7 @@
    },
    {
     "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 10,
     "metadata": {
      "collapsed": true
     },
@@ -176,7 +235,7 @@
    },
    {
     "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 24,
     "metadata": {},
     "outputs": [
      {
@@ -185,7 +244,7 @@
         "'mc'"
        ]
       },
-     "execution_count": 10,
+     "execution_count": 24,
       "metadata": {},
       "output_type": "execute_result"
      }
@@ -213,7 +272,7 @@
      "\n",
      "\"Multiplying\" letters is done by converting the letters to their position in the alphabet (starting at one) and multiplying. For instance, to multiply `u` by 11, convert `u` to `21`, multiply by 11 (`21` × `11` = `231`), then convert back to a letter (`231` is larger than 26, so it becomes `23`, which is `w`).\n",
      "\n",
-    "Again, anything that isn't a lower-case letter is ignored.\n",
+    "Again, all letters are converted to lower-case and anything that isn't a letter is ignored.\n",
      "\n",
      "For example, to find the code from the pass phrase `the cat`, the code starts as being the first two letters `ri`. When the first letter is encrypted, the first letter of the code becomes:\n",
      "\n",
@@ -243,9 +302,18 @@
      "Using this new algorithm, **what is your door code?**"
     ]
    },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Worked example of solution: Part 2\n",
+    "\n",
+    "This is almost identical to part 1, but the arithmetic is slightly different. Note the use of keyword arguments with default values, to allow the code to use different starting values."
+   ]
+  },
    {
     "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 12,
     "metadata": {},
     "outputs": [
      {
@@ -254,7 +322,7 @@
         "(21, 231, 23, 'w')"
        ]
       },
-     "execution_count": 11,
+     "execution_count": 12,
       "metadata": {},
       "output_type": "execute_result"
      }
@@ -265,7 +333,7 @@
    },
    {
     "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 13,
     "metadata": {},
     "outputs": [
      {
@@ -274,7 +342,7 @@
         "(18, 9, 45, 63, 'k')"
        ]
       },
-     "execution_count": 12,
+     "execution_count": 13,
       "metadata": {},
       "output_type": "execute_result"
      }
@@ -285,7 +353,7 @@
    },
    {
     "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 14,
     "metadata": {},
     "outputs": [
      {
@@ -294,7 +362,7 @@
         "(9, 20, 220, 229, 'u')"
        ]
       },
-     "execution_count": 13,
+     "execution_count": 14,
       "metadata": {},
       "output_type": "execute_result"
      }
@@ -305,12 +373,32 @@
    },
    {
     "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 21,
     "metadata": {
      "collapsed": true
     },
     "outputs": [],
     "source": [
+    "def whash2(word, h0=None, alpha=5, beta=11):\n",
+    "    if h0 is None:\n",
+    "        h = list('ri')\n",
+    "    else:\n",
+    "        h = list(h0)\n",
+    "    for l in word:\n",
+    "        h[0] = c(o(h[0]) + o(h[1]) * alpha)\n",
+    "        h[1] = c(o(h[1]) + o(l) * beta)\n",
+    "    return ''.join(h)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "# Extended version that generates the tables used in the question text.\n",
      "def whash2(word, h0=None, alpha=5, beta=11, show_steps=False):\n",
      "    if show_steps:\n",
      "        print('| old code | code as<br>numbers | passphrase<br/>letter | number of<br/>letter | new first<br/>part of code |'\n",
@@ -339,7 +427,7 @@
    },
    {
     "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 16,
     "metadata": {},
     "outputs": [
      {
@@ -362,7 +450,7 @@
         "'vl'"
        ]
       },
-     "execution_count": 15,
+     "execution_count": 16,
       "metadata": {},
       "output_type": "execute_result"
      }
@@ -373,7 +461,7 @@
    },
    {
     "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": 22,
     "metadata": {},
     "outputs": [
      {
@@ -382,7 +470,7 @@
         "'qb'"
        ]
       },
-     "execution_count": 16,
+     "execution_count": 22,
       "metadata": {},
       "output_type": "execute_result"
      }
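
The hunks above call two short helpers, `o` and `c`, whose definitions fall outside the lines shown in this diff. The sketch below pulls the visible functions into one runnable script so the worked values can be checked. The bodies of `o` and `c` are an assumption, reconstructed from the idioms described in the Part 1 notes (one-based numbering, `%` to wrap around) and from the worked values in the notebook outputs; the notebook's own definitions may differ in detail.

```python
import string


def o(letter):
    """Assumed helper: one-based position of a lower-case letter ('a' -> 1)."""
    return ord(letter) - ord('a') + 1


def c(number):
    """Assumed helper: letter for a one-based position, wrapping past 'z'."""
    return chr((number - 1) % 26 + ord('a'))


def sanitise(phrase):
    # Keep only ASCII letters, converting each to lower case.
    return ''.join(l.lower() for l in phrase if l in string.ascii_letters)


def whash1(word):
    # Part 1: the code starts as the first two letters of the word.
    h = list(word[:2])
    for l in word[2:]:
        h[0] = c(o(h[0]) + o(h[1]))
        h[1] = c(o(h[1]) + o(l))
    return ''.join(h)


def whash2(word, h0=None, alpha=5, beta=11):
    # Part 2: the code starts as 'ri' unless another start is supplied.
    h = list('ri') if h0 is None else list(h0)
    for l in word:
        h[0] = c(o(h[0]) + o(h[1]) * alpha)
        h[1] = c(o(h[1]) + o(l) * beta)
    return ''.join(h)


print(sanitise('Hello World'))  # helloworld
print(c(o('u') * 11))           # w  (21 * 11 = 231, which wraps to 23)
print(whash2('t'))              # ku (first step of the `the cat` example)
```

With these assumed helpers, the intermediate values match the tuples in the notebook outputs: `r` and `i` give 18 + 9 × 5 = 63, which wraps to `k`, and `i` with `t` gives 9 + 20 × 11 = 229, which wraps to `u`.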