+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Worked example solution: part 1\n",
+ "This is the first of two tasks intended to exercise skills developed in M269. In this case, this is about search. I won't go over the general idea of search itself here, as there are loads of tutorials online about it. Instead, I'll talk about the specific way I've implmeneted it in this solution.\n",
+ "\n",
+ "(See below for the <a href=\"#part2\">discussion of part 2</a>.)\n",
+ "\n",
+ "## Search\n",
+ "The offices/words can be thought of as a graph of nodes and edges. Each word is a node, and there's an edge between two words if those words differ by one letter. The diagrams below show the connections between words that are one step and two steps away from 'rush'.\n",
+ "\n",
+ "| Words one step from 'rush' | Words two steps from 'rush' |\n",
+ "| ------------- | ------------- |\n",
+ "| <a href=\"rush1.dot.png\"><img src=\"rush1.dot.png\" alt=\"Words one step from 'rush'\" style=\"width: 200px;\"/></a> | <a href=\"rush2.dot.png\"><img src=\"rush2.dot.png\" alt=\"Words two steps from 'rush'\" style=\"width: 200px;\"/></a> |\n",
+ "\n",
+ "The task is to find a path, a sequence of words, from the starting word to the destination. This is a classic search problem.\n",
+ "\n",
+ "The key data structure is the _agenda_, a set of partial routes. We take a partial route from the agenda and _extend_ it with the words/rooms reachable from the end of the partial route, giving a new set of partial routes. Those are added back into the agenda.\n",
+ "\n",
+ "Search finishes either when the partial route we're processing is a solution (i.e. goes form the origin to destination room) or the agenda becomes empty.\n",
+ "\n",
+ "For instance, say we're going from `ache` to `ashy`. The initial agenda consists of just one partial route, and that partial route contains just `ache`.\n",
+ "\n",
+ "```\n",
+ "ache\n",
+ "```\n",
+ "\n",
+ "We take that item off the agenda, extend it (with `acne`, `acme`, `acre`, and `achy` as neighbours of `ache`). When we add those four items to the agenda, we get \n",
+ "\n",
+ "```\n",
+ "ache -> acne\n",
+ "ache -> acme\n",
+ "ache -> acre\n",
+ "ache -> achy\n",
+ "```\n",
+ "\n",
+ "We then proces the `ache -> acne` partial path, extending it with `acme` and `acre`, giving the agenda:\n",
+ "\n",
+ "```\n",
+ "ache -> acme\n",
+ "ache -> acre\n",
+ "ache -> achy\n",
+ "ache -> acne -> acme\n",
+ "ache -> acne -> acre\n",
+ "```\n",
+ "\n",
+ "(Note that while `ache` is adjacent to `acne`, we don't want to create a new partial path to `ache` as we've already visited it in this path.)\n",
+ "\n",
+ "We then proces the `ache -> acme` partial path, extending it with `acne` and `acre`, giving the agenda:\n",
+ "\n",
+ "```\n",
+ "ache -> acre\n",
+ "ache -> achy\n",
+ "ache -> acne -> acme\n",
+ "ache -> acne -> acre\n",
+ "ache -> acme -> acne\n",
+ "ache -> acme -> acre\n",
+ "```\n",
+ "\n",
+ "We then do `ache -> acre` and `ache -> achy` to give:\n",
+ "\n",
+ "```\n",
+ "ache -> acne -> acme\n",
+ "ache -> acne -> acre\n",
+ "ache -> acme -> acne\n",
+ "ache -> acme -> acre\n",
+ "ache -> acre -> acne\n",
+ "ache -> acre -> acme\n",
+ "ache -> achy -> ashy\n",
+ "```\n",
+ "\n",
+ "`ache -> acne -> acme` has only one valid extension, so we get:\n",
+ "\n",
+ "```\n",
+ "ache -> acne -> acre\n",
+ "ache -> acme -> acne\n",
+ "ache -> acme -> acre\n",
+ "ache -> acre -> acne\n",
+ "ache -> acre -> acme\n",
+ "ache -> achy -> ashy\n",
+ "ache -> acne -> acme -> acre\n",
+ "```\n",
+ "\n",
+ "We process all the other partial paths in turn until we get to the agenda looking like:\n",
+ "```\n",
+ "ache -> achy -> ashy\n",
+ "ache -> acne -> acme -> acre\n",
+ "ache -> acne -> acre -> acme\n",
+ "ache -> acme -> acne -> acre\n",
+ "ache -> acme -> acre -> acne\n",
+ "ache -> acre -> acne -> acme\n",
+ "ache -> acre -> acme -> acne\n",
+ "```\n",
+ "At this point, the first partial path in the agenda is a solution, so we report success and return that path.\n",
+ "\n",
+ "With breadth-first search, we add the newly-discovered partial paths to the end of the agenda, meaning we go through the graph one layer at a time. If we add the new partial paths to the front of the agenda, we have depth-first search, where we explore one trail fully before backtracking and trying another.\n",
+ "\n",
+ "There are other clever things we can do, such as sorting the agenda by predicted distance to go, which can speed things up.\n",
+ "\n",
+ "## Agenda and paths\n",
+ "As each partial path is a sequence of words in order, it makes sense to store it as a Python `list` of words. For breadth-first and depth-first searches, the agenda is a sequence of partial paths. Again, it makes sense to store this a a Python `list` of partial paths.\n",
+ "\n",
+ "For A* search, the agenda is sorted by the sum of path length and predicted cost to complete the path. In this case, it makes sense to represent the agenda by a priority queue, also known as a heap. I decided to use the standard Python `heapq` library, rather than the one used in M269. Partial paths are put on the heap with `heappush` and the partial path with the lowest cost is removed with `heappop`. \n",
+ "\n",
+ "## The graph of words\n",
+ "How to represent the graph of words? In the input, the connections between words are implicit, so we need to write a little procedure that returns all the neighbouring words for the word passed in. \n",
+ "\n",
+ "There's also a choice here: to we calculate and store the explicit graph of words, or do we rediscover the neighbours every time we want to process a word? The first approach consumes space to reduce time; the second uses time to reduce space. \n",
+ "\n",
+ "In this case, as there are only 2300 words and 20,000 connections, it's not a lot of space. The search will be examining lots of nodes, so I took a decision to explicity cache all the word neighbour relations first, then just look them up as needed. \n",
+ "\n",
+ "(It turns out this wasn't a good idea in terms of raw perfromance. Generating the graph takes ~130ms, but seems to give just about no change in performance for any search algorithm. Ho hum, but an example of where a [premature optimistion](https://en.wikiquote.org/wiki/Donald_Knuth) turns out to be costly!)\n",
+ "\n",
+ "## Important optimisation: closed sets\n",
+ "If you look at the diagrams and traces above, you'll see that the graph of words is very heavily connected, with several different routes from one word to another. But we often don't need to try out all these alternatives. If we're considering a partial route that ends at a particular word _w_, and we've already found another partial route to _w_ that's no longer than this one, there's no need to continue analysing this one. For instance, in the trace above, there's a route `ache -> acne -> acme`, but we've already found the route `ache -> acme`, so we can discard the `ache -> acne -> acme` alternative without worrying about missing possible solutions.\n",
+ "\n",
+ "If we use something like breadth-first search, we know that the first time we encounter a new word is the shortest path to that word. That means that all subsequent times we arrive that that word, we know we can discard that partial path. \n",
+ "\n",
+ "We maintain a `closed set` of the words we've already processed. We can use that set in two places. Either we check the partial path when we take it off the agenda, or we use the closed set to reduce the number of partial paths generated at each step. In the implementation below, I do the latter.\n",
+ "\n",
+ "## Another optimisation: `list` or `set` of words?\n",
+ "The obvious way to represent the known words is as a list. But, we don't care about the order of items in the collection of words. All we do with them is check for existence of a word, and iterate through the whole collection. In these cases, the Python `set` is a much more efficient data structure than the Pyhton `list`, as it uses something like a `dict` underneath. This means membership checking is much faster while iteration takes about the same amount of time.\n",
+ "\n",
+ "Therefore, rather than using `list`s, we'll use `set`s where possible. In fact, as the set of words doesn't change, we can use the `frozenset` type to indicate that the set is immutable."
+ ]
+ },