{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First nights in hotels are always a bit of an anticlimax, what with the recovery from travel and all. You decided to do one of your wordsearch puzzles.\n",
    "\n",
    "These puzzles are a bit different from normal because they have a puzzle grid and a list of words, but only some of the words are in the puzzle; some of the words given are decoys and aren't present.\n",
    "\n",
    "For instance, given the grid:\n",
    "\n",
    "```\n",
    "fhjpamownq\n",
    "wsdseuqeev\n",
    "ieaeladhcr\n",
    "piedtiriah\n",
    "rzsiwovspu\n",
    "oawhakieab\n",
    "browpcfrda\n",
    "ecnotopssr\n",
    "dikchdnpnb\n",
    "bstapleokr\n",
    "```\n",
    "and the list of words:\n",
    "\n",
    "* adapting, apace, bombing, cackles, carnal, chump, coccyxes, cowhides, crazies, crumbled, dock, fickler, foaming, guts, knows, lived, minuend, molested, mown, pears, probed, rhubarb, rioted, shinier, solaria, staple, tops, wide, winced\n",
    "\n",
    "you can find these words:\n",
    "\n",
    "* apace, cowhides, crazies, dock, knows, lived, mown, pears, probed, rhubarb, rioted, staple, tops, wide\n",
    "\n",
    "but these are the decoys:\n",
    "\n",
    "* adapting, bombing, cackles, carnal, chump, coccyxes, crumbled, fickler, foaming, guts, minuend, molested, shinier, solaria, winced\n",
    "\n",
    "For this puzzle, there are 14 words with a total length of 76 letters. (Some of the words may overlap in the grid, but don't worry about that when counting word lengths in your solution.)\n",
    "\n",
    "## About wordsearches\n",
    "\n",
    "Words can go in any of the eight directions (up, down, left, right, and diagonals) in a straight line. A letter in the grid can be in more than one word. Words don't wrap around the edges of the grid.\n",
    "\n",
    "In the example above, the words \"lived\", \"wide\" and \"staple\" are in these positions (two words are diagonal and share a letter).\n",
    "\n",
    "```\n",
    "..........\n",
    ".......e..\n",
    "....l.d...\n",
    ".....i....\n",
    "....w.v...\n",
    ".......e..\n",
    "........d.\n",
    "..........\n",
    "..........\n",
    ".staple...\n",
    "```\n",
    "\n",
    "The longest word, \"cowhides\", runs vertically upwards:\n",
    "\n",
    "```\n",
    "..........\n",
    "...s......\n",
    "...e......\n",
    "...d......\n",
    "...i......\n",
    "...h......\n",
    "...w......\n",
    "...o......\n",
    "...c......\n",
    "..........\n",
    "```\n",
    "\n",
    "If there are words present in the grid that aren't in the list of words given, ignore them. For instance, you can see the word \"brow\" running left to right on the seventh row of the grid, but that doesn't count as a word in this puzzle because \"brow\" isn't in the list of words you're given.\n",
    "\n",
    "You're safe to assume that each word in the puzzle is present either zero or one times, never more.\n",
    "\n",
    "## File format\n",
    "The wordsearch puzzle is given as a text file. The first line of the file is WxH, where W and H are the width and height of the puzzle grid, in characters. The next H lines are the grid, each line being W characters long. Finally, there's an arbitrary number of words to look for, one on each line.\n",
    "\n",
    "Ignore any trailing or leading blank lines, and any whitespace on a line.\n",
    "\n",
    "The example puzzle above, a ten by ten grid, would be written to a file as:\n",
    "\n",
    "```\n",
    "10x10\n",
    "fhjpamownq\n",
    "wsdseuqeev\n",
    "ieaeladhcr\n",
    "piedtiriah\n",
    "rzsiwovspu\n",
    "oawhakieab\n",
    "browpcfrda\n",
    "ecnotopssr\n",
    "dikchdnpnb\n",
    "bstapleokr\n",
    "adapting\n",
    "apace\n",
    "bombing\n",
    "cackles\n",
    "carnal\n",
    "chump\n",
    "coccyxes\n",
    "cowhides\n",
    "crazies\n",
    "crumbled\n",
    "dock\n",
    "fickler\n",
    "foaming\n",
    "guts\n",
    "knows\n",
    "lived\n",
    "minuend\n",
    "molested\n",
    "mown\n",
    "pears\n",
    "probed\n",
    "rhubarb\n",
    "rioted\n",
    "shinier\n",
    "solaria\n",
    "staple\n",
    "tops\n",
    "wide\n",
    "winced\n",
    "```\n",
    "\n",
    "# Part 1\n",
    "\n",
    "Your wordsearch puzzle is given in [10-wordsearch.txt](10-wordsearch.txt). \n",
    "\n",
    "What is the total of the lengths of all the words present in the puzzle?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After you've solved the wordsearch and found all the words you can, you'll have some letters unused in any word. For the example wordsearch, once you've found the words, you're left with this:\n",
    "\n",
    "```\n",
    "fhj.a....q\n",
    "w....uq..v\n",
    "i.a....h..\n",
    "..e....i..\n",
    "..........\n",
    "....a.....\n",
    "....p.f...\n",
    "........s.\n",
    ".i..h.npn.\n",
    "b......okr\n",
    "```\n",
    "The letters remaining in the grid are `aaabeffhhhiiijknnoppqqrsuvw`. 9 of those letters are vowels. \n",
    "\n",
    "# Part 2\n",
    "\n",
    "Your wordsearch puzzle is still given in [10-wordsearch.txt](10-wordsearch.txt).\n",
    "\n",
    "How many vowels are left over after solving this puzzle?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Worked solution: Part 1\n",
    "After all the algorithmic headscratching of the previous couple of days, time for something a bit more sedate. \n",
    "\n",
    "## Data structures\n",
    "This is fairly obvious. The wordsearch grid is just a grid of letters. I could store it as\n",
    "* a list of lists of letters\n",
    "* a list of strings\n",
    "* a dict of letters, with key being the (row, column) pair/2-tuple\n",
    "\n",
    "As I'll be updating the grid in part 2, it makes sense to have something mutable, so the list of strings will be awkward. Beyond that, there's not much to choose, so I went for a list of lists of letters. \n",
    "\n",
    "I use the `enum` library to build a new type for directions. This makes the program a bit clearer, as well as prevents typos if I have to refer to a particular direction. The `delta` dict converts from a direction to the row and column changes that move one step in that direction. \n",
    "\n",
    "## Examining the grid\n",
    "With a standard string or list, Python (along with many other languages) has the idea of a _slice_: a section of the list identified by its start position and length. The function `gslice` (for \"grid slice\") does the same thing: return a slice of grid, defined by starting row and column, length, and direction. All `gslice` does is return the cell called out by `indices`. `indices` uses the `delta` dict to convert the direction into the row and column changes, and also does the bounds checking to only return (row, column) pairs that are in the grid.\n",
    "\n",
    "`present_many` just does an exhaustive search of the grid, finding all possible grid slices (all rows, column, directions, and valid lengths) and checking if the slice returned is one of the given words. \n",
    "\n",
    "(There's another version of `present_many` that uses comprehensions to find the words, but that's slower as it keeps having to call `gslice` and `indices` for checking.)\n",
    "\n",
    "# Worked solution: Part 2\n",
    "This builds on part 1. First, I find all the words. I then take all those word positions and, on a copy of the grid, replace the characters in the word position with dots, using the `set_grid` procedure. Then it's just a matter of concatenating the grid into one long string, filtering to keep only the vowels, and counting how many there are."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import string\n",
    "import re\n",
    "import collections\n",
    "import copy\n",
    "import os\n",
    "\n",
    "from enum import Enum\n",
    "Direction = Enum('Direction', 'left right up down upleft upright downleft downright')\n",
    "    \n",
    "delta = {Direction.left: (0, -1),Direction.right: (0, 1), \n",
    "         Direction.up: (-1, 0), Direction.down: (1, 0), \n",
    "         Direction.upleft: (-1, -1), Direction.upright: (-1, 1), \n",
    "         Direction.downleft: (1, -1), Direction.downright: (1, 1)}\n",
    "\n",
    "cat = ''.join\n",
    "wcat = ' '.join\n",
    "lcat = '\\n'.join"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def empty_grid(w, h):\n",
    "    return [['.' for c in range(w)] for r in range(h)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def show_grid(grid):\n",
    "    return lcat(cat(r) for r in grid)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def indices(grid, r, c, l, d):\n",
    "    dr, dc = delta[d]\n",
    "    w = len(grid[0])\n",
    "    h = len(grid)\n",
    "    inds = [(r + i * dr, c + i * dc) for i in range(l)]\n",
    "    return [(i, j) for i, j in inds\n",
    "           if i >= 0\n",
    "           if j >= 0\n",
    "           if i < h\n",
    "           if j < w]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def gslice(grid, r, c, l, d):\n",
    "    return [grid[i][j] for i, j in indices(grid, r, c, l, d)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def set_grid(grid, r, c, d, word):\n",
    "    for (i, j), l in zip(indices(grid, r, c, len(word), d), word):\n",
    "        grid[i][j] = l\n",
    "    return grid"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def present_many(grid, words):\n",
    "    w = len(grid[0])\n",
    "    h = len(grid)\n",
    "    wordlens = set(len(w) for w in words)\n",
    "    presences = []\n",
    "    for r in range(h):\n",
    "        for c in range(w):\n",
    "            for d in Direction:\n",
    "                for wordlen in wordlens:\n",
    "                    word = cat(gslice(grid, r, c, wordlen, d))\n",
    "                    if len(word) == wordlen and word in words:\n",
    "                        presences += [(word, r, c, d)]\n",
    "    return set(presences)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "def present_many_c(grid, words):\n",
    "    w = len(grid[0])\n",
    "    h = len(grid)\n",
    "    wordlens = set(len(w) for w in words)\n",
    "    presences = set((cat(gslice(grid, r, c, wordlen, d)), r, c, d)\n",
    "                    for r in range(h)\n",
    "                    for c in range(w)\n",
    "                    for d in Direction\n",
    "                    for wordlen in wordlens\n",
    "                    if len(indices(grid, r, c, wordlen, d)) == wordlen\n",
    "                    if cat(gslice(grid, r, c, wordlen, d)) in words )\n",
    "    return set(presences)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def read_wordsearch(fn):\n",
    "    lines = [l.strip() for l in open(fn).readlines()]\n",
    "    w, h = [int(s) for s in lines[0].split('x')]\n",
    "    grid = lines[1:h+1]\n",
    "    words = set(lines[h+1:])\n",
    "    return w, h, grid, words"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# All wordsearch puzzles"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def read_all_wordsearch(fn):\n",
    "    with open(fn) as f:\n",
    "        text = f.read().strip()\n",
    "        puzzles_text = text.split('\\n\\n')\n",
    "        puzzles = []\n",
    "        for p in puzzles_text:\n",
    "            lines = p.splitlines()\n",
    "            w, h = [int(s) for s in lines[0].split('x')]\n",
    "            grid = lines[1:h+1]\n",
    "            words = lines[h+1:]\n",
    "            puzzles += [(w, h, grid, words)]\n",
    "    return puzzles"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Huge wordsearch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(100, 100)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "puzzle = read_wordsearch('10-wordsearch.txt')\n",
    "puzzle[:2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def found_words_length(puzzle):\n",
    "    width, height, grid, words = puzzle\n",
    "    return sum(len(p[0]) for p in present_many(grid, words))\n",
    "\n",
    "def total_found_words_length(puzzles):\n",
    "    return sum(found_words_length(p) for p in puzzles)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def found_words_length_c(puzzle):\n",
    "    width, height, grid, words = puzzle\n",
    "    return sum(len(p[0]) for p in present_many_c(grid, words))\n",
    "\n",
    "def total_found_words_length_c(puzzles):\n",
    "    return sum(found_words_length_c(p) for p in puzzles)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "8092"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "found_words_length(puzzle)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 loop, best of 3: 7.4 s per loop\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "found_words_length(puzzle)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "8092"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "found_words_length_c(puzzle)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 loop, best of 3: 12 s per loop\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "found_words_length_c(puzzle)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(1149, 1149, 1149)"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "width, height, grid, words = puzzle\n",
    "presences = present_many(grid, words)\n",
    "found_words = [p[0] for p in presences]\n",
    "len(presences), len(found_words), len(set(found_words))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 loop, best of 3: 7.4 s per loop\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "present_many(grid, words)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 loop, best of 3: 11.9 s per loop\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "present_many_c(grid, words)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(1149, 1149)"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "found_words = [p[0] for p in presences]\n",
    "len(found_words), len(set(found_words))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def max_unfound_word_length(puzzle):\n",
    "    width, height, grid, words = puzzle\n",
    "    presences = present_many(grid, words)\n",
    "    used_words = [p[0] for p in presences]\n",
    "    unused_words = [w for w in words if w not in used_words]\n",
    "    \n",
    "    unused_grid = [[c for c in r] for r in grid]\n",
    "    for w, r, c, d in presences:\n",
    "        set_grid(unused_grid, r, c, d, '.' * len(w))\n",
    "    unused_letters = [c for l in unused_grid for c in l if c != '.']\n",
    "    unused_letter_count = collections.Counter(unused_letters)\n",
    "    \n",
    "    makeable_words = []\n",
    "    for w in unused_words:\n",
    "        unused_word_count = collections.Counter(w)\n",
    "        if all(unused_word_count[l] <= unused_letter_count[l] for l in unused_word_count):\n",
    "            makeable_words += [w]\n",
    "    lwm = max(len(w) for w in makeable_words)\n",
    "    return lwm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def unused_letters(puzzle):\n",
    "    width, height, grid, words = puzzle\n",
    "    presences = present_many(grid, words)\n",
    "#     used_words = [p[0] for p in presences]\n",
    "#     unused_words = [w for w in words if w not in used_words]\n",
    "    \n",
    "    unused_grid = [[c for c in r] for r in grid]\n",
    "    for w, r, c, d in presences:\n",
    "        set_grid(unused_grid, r, c, d, '.' * len(w))\n",
    "    unused_letters = [c for l in unused_grid for c in l if c != '.']\n",
    "    unused_letter_count = collections.Counter(unused_letters)\n",
    "    \n",
    "    return used_words, unused_letter_count"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def unused_vowels(puzzle):\n",
    "    width, height, grid, words = puzzle\n",
    "    presences = present_many(grid, words)\n",
    "#     used_words = [p[0] for p in presences]\n",
    "#     unused_words = [w for w in words if w not in used_words]\n",
    "    \n",
    "    unused_grid = [[c for c in r] for r in grid]\n",
    "    for w, r, c, d in presences:\n",
    "        set_grid(unused_grid, r, c, d, '.' * len(w))\n",
    "    unused_vowel_count = sum(1 for l in unused_grid for c in l if c in 'aeiou')\n",
    "    return unused_vowel_count"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def total_max_unfound_word_length(puzzles):\n",
    "    return sum(max_unfound_word_length(p) for p in puzzles)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "594"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "unused_vowels(puzzle)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 loop, best of 3: 30.8 s per loop\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "unused_vowels(puzzle)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# max_unfound_word_length(puzzle)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# %%timeit\n",
    "# max_unfound_word_length(puzzle)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2217"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "uw, unlc = unused_letters(puzzle)\n",
    "sum(unlc[l] for l in unlc)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}