61d009756777b3cb64ff1bd1b0547a07cdf247a5
[ou-summer-of-code-2017.git] / 10-word-search / wordsearch-solution.ipynb
1 {
2 "cells": [
3 {
4 "cell_type": "markdown",
5 "metadata": {},
6 "source": [
7 "First nights in hotels are always a bit of an anticlimax, what with the recovery from travel and all. You decided to do one of your wordsearch puzzles.\n",
8 "\n",
9 "These puzzles are a bit different from normal because they have a puzzle grid and a list of words, but only some of the words are in the puzzle; some of the words given are decoys and aren't present.\n",
10 "\n",
11 "For instance, given the grid:\n",
12 "\n",
13 "```\n",
14 "fhjpamownq\n",
15 "wsdseuqeev\n",
16 "ieaeladhcr\n",
17 "piedtiriah\n",
18 "rzsiwovspu\n",
19 "oawhakieab\n",
20 "browpcfrda\n",
21 "ecnotopssr\n",
22 "dikchdnpnb\n",
23 "bstapleokr\n",
24 "```\n",
25 "and the list of words:\n",
26 "\n",
27 "* adapting, apace, bombing, cackles, carnal, chump, coccyxes, cowhides, crazies, crumbled, dock, fickler, foaming, guts, knows, lived, minuend, molested, mown, pears, probed, rhubarb, rioted, shinier, solaria, staple, tops, wide, winced\n",
28 "\n",
29 "you can find these words:\n",
30 "\n",
31 "* apace, cowhides, crazies, dock, knows, lived, mown, pears, probed, rhubarb, rioted, staple, tops, wide\n",
32 "\n",
33 "but these are the decoys:\n",
34 "\n",
35 "* adapting, bombing, cackles, carnal, chump, coccyxes, crumbled, fickler, foaming, guts, minuend, molested, shinier, solaria, winced\n",
36 "\n",
37 "For this puzzle, there are 14 words with a total length of 76 letters. (Some of the words may overlap in the grid, but don't worry about that when counting word lengths in your solution.)\n",
38 "\n",
39 "## About wordsearches\n",
40 "\n",
41 "Words can go in any of the eight directions (up, down, left, right, and diagonals) in a straight line. A letter in the grid can be in more than one word. Words don't wrap around the edges of the grid.\n",
42 "\n",
43 "In the example above, the words \"lived\", \"wide\" and \"staple\" are in these positions (two words are diagonal and share a letter).\n",
44 "\n",
45 "```\n",
46 "..........\n",
47 ".......e..\n",
48 "....l.d...\n",
49 ".....i....\n",
50 "....w.v...\n",
51 ".......e..\n",
52 "........d.\n",
53 "..........\n",
54 "..........\n",
55 ".staple...\n",
56 "```\n",
57 "\n",
58 "The longest word, \"cowhides\", runs vertically upwards:\n",
59 "\n",
60 "```\n",
61 "..........\n",
62 "...s......\n",
63 "...e......\n",
64 "...d......\n",
65 "...i......\n",
66 "...h......\n",
67 "...w......\n",
68 "...o......\n",
69 "...c......\n",
70 "..........\n",
71 "```\n",
72 "\n",
73 "If there are words present in the grid that aren't in the list of words given, ignore them. For instance, you can see the word \"brow\" running left to right on the seventh row of the grid, but that doesn't count as a word in this puzzle because \"brow\" isn't in the list of words you're given.\n",
74 "\n",
75 "You're safe to assume that each word in the puzzle is present either zero or one times, never more.\n",
76 "\n",
77 "## File format\n",
78 "The wordsearch puzzle is given as a text file. The first line of the file is WxH, where W and H are the width and height of the puzzle grid, in characters. The next H lines are the grid, each line being W characters long. Finally, there's an arbitrary number of words to look for, one on each line.\n",
79 "\n",
80 "Ignore any trailing or leading blank lines, and any whitespace on a line.\n",
81 "\n",
82 "The example puzzle above, a ten by ten grid, would be written to a file as:\n",
83 "\n",
84 "```\n",
85 "10x10\n",
86 "fhjpamownq\n",
87 "wsdseuqeev\n",
88 "ieaeladhcr\n",
89 "piedtiriah\n",
90 "rzsiwovspu\n",
91 "oawhakieab\n",
92 "browpcfrda\n",
93 "ecnotopssr\n",
94 "dikchdnpnb\n",
95 "bstapleokr\n",
96 "adapting\n",
97 "apace\n",
98 "bombing\n",
99 "cackles\n",
100 "carnal\n",
101 "chump\n",
102 "coccyxes\n",
103 "cowhides\n",
104 "crazies\n",
105 "crumbled\n",
106 "dock\n",
107 "fickler\n",
108 "foaming\n",
109 "guts\n",
110 "knows\n",
111 "lived\n",
112 "minuend\n",
113 "molested\n",
114 "mown\n",
115 "pears\n",
116 "probed\n",
117 "rhubarb\n",
118 "rioted\n",
119 "shinier\n",
120 "solaria\n",
121 "staple\n",
122 "tops\n",
123 "wide\n",
124 "winced\n",
125 "```\n",
126 "\n",
127 "# Part 1\n",
128 "\n",
129 "Your wordsearch puzzle is given in [10-wordsearch.txt](10-wordsearch.txt). \n",
130 "\n",
131 "What is the total of the lengths of all the words present in the puzzle?"
132 ]
133 },
134 {
135 "cell_type": "markdown",
136 "metadata": {},
137 "source": [
138 "After you've solved the wordsearch and found all the words you can, you'll have some letters unused in any word. For the example wordsearch, once you've found the words, you're left with this:\n",
139 "\n",
140 "```\n",
141 "fhj.a....q\n",
142 "w....uq..v\n",
143 "i.a....h..\n",
144 "..e....i..\n",
145 "..........\n",
146 "....a.....\n",
147 "....p.f...\n",
148 "........s.\n",
149 ".i..h.npn.\n",
150 "b......okr\n",
151 "```\n",
152 "The letters remaining in the grid are `aaabeffhhhiiijknnoppqqrsuvw`. 9 of those letters are vowels. \n",
153 "\n",
154 "# Part 2\n",
155 "\n",
156 "Your wordsearch puzzle is still given in [10-wordsearch.txt](10-wordsearch.txt).\n",
157 "\n",
158 "How many vowels are left over after solving this puzzle?"
159 ]
160 },
161 {
162 "cell_type": "code",
163 "execution_count": 1,
164 "metadata": {
165 "collapsed": true
166 },
167 "outputs": [],
168 "source": [
169 "import string\n",
170 "import re\n",
171 "import collections\n",
172 "import copy\n",
173 "import os\n",
174 "\n",
175 "from enum import Enum\n",
176 "Direction = Enum('Direction', 'left right up down upleft upright downleft downright')\n",
177 " \n",
178 "delta = {Direction.left: (0, -1),Direction.right: (0, 1), \n",
179 " Direction.up: (-1, 0), Direction.down: (1, 0), \n",
180 " Direction.upleft: (-1, -1), Direction.upright: (-1, 1), \n",
181 " Direction.downleft: (1, -1), Direction.downright: (1, 1)}\n",
182 "\n",
183 "cat = ''.join\n",
184 "wcat = ' '.join\n",
185 "lcat = '\\n'.join"
186 ]
187 },
188 {
189 "cell_type": "code",
190 "execution_count": 2,
191 "metadata": {
192 "collapsed": true
193 },
194 "outputs": [],
195 "source": [
196 "def empty_grid(w, h):\n",
197 " return [['.' for c in range(w)] for r in range(h)]"
198 ]
199 },
200 {
201 "cell_type": "code",
202 "execution_count": 3,
203 "metadata": {
204 "collapsed": true
205 },
206 "outputs": [],
207 "source": [
208 "def show_grid(grid):\n",
209 " return lcat(cat(r) for r in grid)"
210 ]
211 },
212 {
213 "cell_type": "code",
214 "execution_count": 4,
215 "metadata": {
216 "collapsed": true
217 },
218 "outputs": [],
219 "source": [
220 "def indices(grid, r, c, l, d):\n",
221 " dr, dc = delta[d]\n",
222 " w = len(grid[0])\n",
223 " h = len(grid)\n",
224 " inds = [(r + i * dr, c + i * dc) for i in range(l)]\n",
225 " return [(i, j) for i, j in inds\n",
226 " if i >= 0\n",
227 " if j >= 0\n",
228 " if i < h\n",
229 " if j < w]"
230 ]
231 },
232 {
233 "cell_type": "code",
234 "execution_count": 5,
235 "metadata": {
236 "collapsed": true
237 },
238 "outputs": [],
239 "source": [
240 "def gslice(grid, r, c, l, d):\n",
241 " return [grid[i][j] for i, j in indices(grid, r, c, l, d)]"
242 ]
243 },
244 {
245 "cell_type": "code",
246 "execution_count": 6,
247 "metadata": {
248 "collapsed": true
249 },
250 "outputs": [],
251 "source": [
252 "def set_grid(grid, r, c, d, word):\n",
253 " for (i, j), l in zip(indices(grid, r, c, len(word), d), word):\n",
254 " grid[i][j] = l\n",
255 " return grid"
256 ]
257 },
258 {
259 "cell_type": "code",
260 "execution_count": 7,
261 "metadata": {
262 "collapsed": true
263 },
264 "outputs": [],
265 "source": [
266 "def present_many(grid, words):\n",
267 " w = len(grid[0])\n",
268 " h = len(grid)\n",
269 " wordlens = set(len(w) for w in words)\n",
270 " presences = []\n",
271 " for r in range(h):\n",
272 " for c in range(w):\n",
273 " for d in Direction:\n",
274 " for wordlen in wordlens:\n",
275 " word = cat(gslice(grid, r, c, wordlen, d))\n",
276 " if word in words:\n",
277 " presences += [(word, r, c, d)]\n",
278 " return set(presences)"
279 ]
280 },
281 {
282 "cell_type": "code",
283 "execution_count": 30,
284 "metadata": {
285 "collapsed": true
286 },
287 "outputs": [],
288 "source": [
289 "def read_wordsearch(fn):\n",
290 " lines = [l.strip() for l in open(fn).readlines()]\n",
291 " w, h = [int(s) for s in lines[0].split('x')]\n",
292 " grid = lines[1:h+1]\n",
293 " words = set(lines[h+1:])\n",
294 " return w, h, grid, words"
295 ]
296 },
297 {
298 "cell_type": "markdown",
299 "metadata": {},
300 "source": [
301 "# All wordsearch puzzles"
302 ]
303 },
304 {
305 "cell_type": "code",
306 "execution_count": 9,
307 "metadata": {
308 "collapsed": true
309 },
310 "outputs": [],
311 "source": [
312 "def read_all_wordsearch(fn):\n",
313 " with open(fn) as f:\n",
314 " text = f.read().strip()\n",
315 " puzzles_text = text.split('\\n\\n')\n",
316 " puzzles = []\n",
317 " for p in puzzles_text:\n",
318 " lines = p.splitlines()\n",
319 " w, h = [int(s) for s in lines[0].split('x')]\n",
320 " grid = lines[1:h+1]\n",
321 " words = lines[h+1:]\n",
322 " puzzles += [(w, h, grid, words)]\n",
323 " return puzzles"
324 ]
325 },
326 {
327 "cell_type": "markdown",
328 "metadata": {},
329 "source": [
330 "## Huge wordsearch"
331 ]
332 },
333 {
334 "cell_type": "code",
335 "execution_count": 31,
336 "metadata": {},
337 "outputs": [
338 {
339 "data": {
340 "text/plain": [
341 "(100, 100)"
342 ]
343 },
344 "execution_count": 31,
345 "metadata": {},
346 "output_type": "execute_result"
347 }
348 ],
349 "source": [
350 "puzzle = read_wordsearch('10-wordsearch.txt')\n",
351 "puzzle[:2]"
352 ]
353 },
354 {
355 "cell_type": "markdown",
356 "metadata": {},
357 "source": [
358 "## Part 1"
359 ]
360 },
361 {
362 "cell_type": "code",
363 "execution_count": 32,
364 "metadata": {
365 "collapsed": true
366 },
367 "outputs": [],
368 "source": [
369 "def found_words_length(puzzle):\n",
370 " width, height, grid, words = puzzle\n",
371 " return sum(len(p[0]) for p in present_many(grid, words))\n",
372 "\n",
373 "def total_found_words_length(puzzles):\n",
374 " return sum(found_words_length(p) for p in puzzles)"
375 ]
376 },
377 {
378 "cell_type": "code",
379 "execution_count": 33,
380 "metadata": {},
381 "outputs": [
382 {
383 "data": {
384 "text/plain": [
385 "8092"
386 ]
387 },
388 "execution_count": 33,
389 "metadata": {},
390 "output_type": "execute_result"
391 }
392 ],
393 "source": [
394 "found_words_length(puzzle)"
395 ]
396 },
397 {
398 "cell_type": "code",
399 "execution_count": 34,
400 "metadata": {},
401 "outputs": [
402 {
403 "name": "stdout",
404 "output_type": "stream",
405 "text": [
406 "1 loop, best of 3: 6.79 s per loop\n"
407 ]
408 }
409 ],
410 "source": [
411 "%%timeit\n",
412 "found_words_length(puzzle)"
413 ]
414 },
415 {
416 "cell_type": "code",
417 "execution_count": 35,
418 "metadata": {},
419 "outputs": [
420 {
421 "data": {
422 "text/plain": [
423 "(1149, 1149, 1149)"
424 ]
425 },
426 "execution_count": 35,
427 "metadata": {},
428 "output_type": "execute_result"
429 }
430 ],
431 "source": [
432 "width, height, grid, words = puzzle\n",
433 "presences = present_many(grid, words)\n",
434 "found_words = [p[0] for p in presences]\n",
435 "len(presences), len(found_words), len(set(found_words))"
436 ]
437 },
438 {
439 "cell_type": "code",
440 "execution_count": 38,
441 "metadata": {},
442 "outputs": [
443 {
444 "name": "stdout",
445 "output_type": "stream",
446 "text": [
447 "1 loop, best of 3: 6.76 s per loop\n"
448 ]
449 }
450 ],
451 "source": [
452 "%%timeit\n",
453 "presences = present_many(grid, words)"
454 ]
455 },
456 {
457 "cell_type": "code",
458 "execution_count": 37,
459 "metadata": {},
460 "outputs": [
461 {
462 "data": {
463 "text/plain": [
464 "(1149, 1149)"
465 ]
466 },
467 "execution_count": 37,
468 "metadata": {},
469 "output_type": "execute_result"
470 }
471 ],
472 "source": [
473 "found_words = [p[0] for p in presences]\n",
474 "len(found_words), len(set(found_words))"
475 ]
476 },
477 {
478 "cell_type": "markdown",
479 "metadata": {},
480 "source": [
481 "## Part 2"
482 ]
483 },
484 {
485 "cell_type": "code",
486 "execution_count": 15,
487 "metadata": {
488 "collapsed": true
489 },
490 "outputs": [],
491 "source": [
492 "def max_unfound_word_length(puzzle):\n",
493 " width, height, grid, words = puzzle\n",
494 " presences = present_many(grid, words)\n",
495 " used_words = [p[0] for p in presences]\n",
496 " unused_words = [w for w in words if w not in used_words]\n",
497 " \n",
498 " unused_grid = [[c for c in r] for r in grid]\n",
499 " for w, r, c, d in presences:\n",
500 " set_grid(unused_grid, r, c, d, '.' * len(w))\n",
501 " unused_letters = [c for l in unused_grid for c in l if c != '.']\n",
502 " unused_letter_count = collections.Counter(unused_letters)\n",
503 " \n",
504 " makeable_words = []\n",
505 " for w in unused_words:\n",
506 " unused_word_count = collections.Counter(w)\n",
507 " if all(unused_word_count[l] <= unused_letter_count[l] for l in unused_word_count):\n",
508 " makeable_words += [w]\n",
509 " lwm = max(len(w) for w in makeable_words)\n",
510 " return lwm"
511 ]
512 },
513 {
514 "cell_type": "code",
515 "execution_count": 16,
516 "metadata": {
517 "collapsed": true
518 },
519 "outputs": [],
520 "source": [
521 "def unused_letters(puzzle):\n",
522 " width, height, grid, words = puzzle\n",
523 " presences = present_many(grid, words)\n",
524 " used_words = [p[0] for p in presences]\n",
525 " unused_words = [w for w in words if w not in used_words]\n",
526 " \n",
527 " unused_grid = [[c for c in r] for r in grid]\n",
528 " for w, r, c, d in presences:\n",
529 " set_grid(unused_grid, r, c, d, '.' * len(w))\n",
530 " unused_letters = [c for l in unused_grid for c in l if c != '.']\n",
531 " unused_letter_count = collections.Counter(unused_letters)\n",
532 " \n",
533 " return used_words, unused_letter_count"
534 ]
535 },
536 {
537 "cell_type": "code",
538 "execution_count": 17,
539 "metadata": {
540 "collapsed": true
541 },
542 "outputs": [],
543 "source": [
544 "def unused_vowels(puzzle):\n",
545 " width, height, grid, words = puzzle\n",
546 " presences = present_many(grid, words)\n",
547 " used_words = [p[0] for p in presences]\n",
548 " unused_words = [w for w in words if w not in used_words]\n",
549 " \n",
550 " unused_grid = [[c for c in r] for r in grid]\n",
551 " for w, r, c, d in presences:\n",
552 " set_grid(unused_grid, r, c, d, '.' * len(w))\n",
553 " unused_vowel_count = sum(1 for l in unused_grid for c in l if c in 'aeiou')\n",
554 " return unused_vowel_count"
555 ]
556 },
557 {
558 "cell_type": "code",
559 "execution_count": 18,
560 "metadata": {
561 "collapsed": true
562 },
563 "outputs": [],
564 "source": [
565 "def total_max_unfound_word_length(puzzles):\n",
566 " return sum(max_unfound_word_length(p) for p in puzzles)"
567 ]
568 },
569 {
570 "cell_type": "code",
571 "execution_count": 19,
572 "metadata": {},
573 "outputs": [
574 {
575 "data": {
576 "text/plain": [
577 "594"
578 ]
579 },
580 "execution_count": 19,
581 "metadata": {},
582 "output_type": "execute_result"
583 }
584 ],
585 "source": [
586 "unused_vowels(puzzle)"
587 ]
588 },
589 {
590 "cell_type": "code",
591 "execution_count": 20,
592 "metadata": {},
593 "outputs": [
594 {
595 "name": "stdout",
596 "output_type": "stream",
597 "text": [
598 "1 loop, best of 3: 30.8 s per loop\n"
599 ]
600 }
601 ],
602 "source": [
603 "%%timeit\n",
604 "unused_vowels(puzzle)"
605 ]
606 },
607 {
608 "cell_type": "code",
609 "execution_count": 21,
610 "metadata": {
611 "collapsed": true
612 },
613 "outputs": [],
614 "source": [
615 "# max_unfound_word_length(puzzle)"
616 ]
617 },
618 {
619 "cell_type": "code",
620 "execution_count": 22,
621 "metadata": {
622 "collapsed": true
623 },
624 "outputs": [],
625 "source": [
626 "# %%timeit\n",
627 "# max_unfound_word_length(puzzle)"
628 ]
629 },
630 {
631 "cell_type": "code",
632 "execution_count": 23,
633 "metadata": {},
634 "outputs": [
635 {
636 "data": {
637 "text/plain": [
638 "2217"
639 ]
640 },
641 "execution_count": 23,
642 "metadata": {},
643 "output_type": "execute_result"
644 }
645 ],
646 "source": [
647 "uw, unlc = unused_letters(puzzle)\n",
648 "sum(unlc[l] for l in unlc)"
649 ]
650 },
651 {
652 "cell_type": "code",
653 "execution_count": null,
654 "metadata": {
655 "collapsed": true
656 },
657 "outputs": [],
658 "source": []
659 }
660 ],
661 "metadata": {
662 "kernelspec": {
663 "display_name": "Python 3",
664 "language": "python",
665 "name": "python3"
666 },
667 "language_info": {
668 "codemirror_mode": {
669 "name": "ipython",
670 "version": 3
671 },
672 "file_extension": ".py",
673 "mimetype": "text/x-python",
674 "name": "python",
675 "nbconvert_exporter": "python",
676 "pygments_lexer": "ipython3",
677 "version": "3.5.2+"
678 }
679 },
680 "nbformat": 4,
681 "nbformat_minor": 1
682 }