Added Java solutions from Anton
[ou-summer-of-code-2017.git] / 04-word-search / wordsearch-solution.ipynb
1 {
2 "cells": [
3 {
4 "cell_type": "markdown",
5 "metadata": {},
6 "source": [
7 "# Wordsearch\n",
8 "Given a text file, consisting of three parts (a grid size, a grid, and a list of words), find:\n",
9 "* the words present in the grid, \n",
10 "* the longest word present in the grid, \n",
11 "* the number of words not present in the grid, \n",
12 "* the longest word not present that can be formed from the leftover letters\n",
13 "\n",
14 "The only words that need be considered are the ones given in the list in the puzzle input.\n",
15 "\n",
16 "The puzzle consists of:\n",
17 "1. A line consisting of _w_`x`_h_, where _w_ and _h_ are integers giving the width and height of the grid.\n",
18 "2. The grid itself, consisting of _h_ lines each of _w_ letters.\n",
19 "3. A list of words, one word per line, of arbitrary length. "
20 ]
21 },
22 {
23 "cell_type": "markdown",
24 "metadata": {},
25 "source": [
26 "## Example\n",
27 "\n",
28 "\n",
29 "`...p.mown.\n",
30 ".sdse..ee.\n",
31 ".e.elad.cr\n",
32 "pi.dtir.ah\n",
33 "rzsiwovspu\n",
34 "oawh.kieab\n",
35 "brow.c.rda\n",
36 "ecnotops.r\n",
37 "d.kc.d...b\n",
38 ".staple...`\n",
39 "\n",
40 "```\n",
41 "fhjpamownq\n",
42 "wsdseuqeev\n",
43 "ieaeladhcr\n",
44 "piedtiriah\n",
45 "rzsiwovspu\n",
46 "oawhakieab\n",
47 "browpcfrda\n",
48 "ecnotopssr\n",
49 "dikchdnpnb\n",
50 "bstapleokr\n",
51 "```\n",
52 "\n",
53 "14 words added; 6 directions\n",
54 "\n",
55 "Present: apace cowhides crazies dock knows lived mown pears probed rhubarb rioted staple tops wide\n",
56 "\n",
57 "Decoys: adapting bombing boor brick cackles carnal casino chaplets chump coaster coccyxes coddle collies creels crumbled cunt curds curled curlier deepen demeanor dicier dowses ensuing faddish fest fickler foaming gambol garoting gliding gristle grunts guts ibex impugns instants kielbasy lanyard loamier lugs market meanly minuend misprint mitts molested moonshot mucking oaks olives orgasmic pastrami perfect proceed puckered quashed refined regards retraces revel ridges ringlet scoff shinier siren solaria sprain sunder sunup tamped tapes thirds throw tiller times trains tranquil transfix typesets uric wariness welts whimsy winced winced\n",
58 "\n",
59 "Decoys: fickler, adapting, chump, foaming, molested, carnal, crumbled, guts, minuend, bombing, winced, coccyxes, solaria, shinier, cackles\n",
60 "\n",
61 "All words: adapting, apace, bombing, cackles, carnal, chump, coccyxes, cowhides, crazies, crumbled, dock, fickler, foaming, guts, knows, lived, minuend, molested, mown, pears, probed, rhubarb, rioted, shinier, solaria, staple, tops, wide, winced\n",
62 "\n",
63 "Directions: [('probed', '`(True, 3, 0, <Direction.down: 4>)`'), ('staple', '`(True, 9, 1, <Direction.right: 2>)`'), ('rioted', '`(True, 6, 7, <Direction.upleft: 5>)`'), ('cowhides', '`(True, 8, 3, <Direction.up: 3>)`'), ('tops', '`(True, 7, 4, <Direction.right: 2>)`'), ('knows', '`(True, 8, 2, <Direction.up: 3>)`'), ('lived', '`(True, 2, 4, <Direction.downright: 8>)`'), ('rhubarb', '`(True, 2, 9, <Direction.down: 4>)`'), ('crazies', '`(True, 7, 1, <Direction.up: 3>)`'), ('dock', '`(True, 8, 5, <Direction.up: 3>)`'), ('apace', '`(True, 5, 8, <Direction.up: 3>)`'), ('mown', '`(True, 0, 5, <Direction.right: 2>)`'), ('pears', '`(True, 0, 3, <Direction.downright: 8>)`'), ('wide', '`(True, 4, 4, <Direction.upright: 6>)`')]"
64 ]
65 },
66 {
67 "cell_type": "code",
68 "execution_count": 114,
69 "metadata": {
70 "collapsed": true
71 },
72 "outputs": [],
73 "source": [
74 "import string\n",
75 "import re\n",
76 "import collections\n",
77 "import copy\n",
78 "import os\n",
79 "\n",
80 "from enum import Enum\n",
81 "Direction = Enum('Direction', 'left right up down upleft upright downleft downright')\n",
82 " \n",
83 "delta = {Direction.left: (0, -1),Direction.right: (0, 1), \n",
84 " Direction.up: (-1, 0), Direction.down: (1, 0), \n",
85 " Direction.upleft: (-1, -1), Direction.upright: (-1, 1), \n",
86 " Direction.downleft: (1, -1), Direction.downright: (1, 1)}\n",
87 "\n",
88 "cat = ''.join\n",
89 "wcat = ' '.join\n",
90 "lcat = '\\n'.join"
91 ]
92 },
93 {
94 "cell_type": "code",
95 "execution_count": 115,
96 "metadata": {
97 "collapsed": true
98 },
99 "outputs": [],
100 "source": [
101 "def empty_grid(w, h):\n",
102 " return [['.' for c in range(w)] for r in range(h)]"
103 ]
104 },
105 {
106 "cell_type": "code",
107 "execution_count": 116,
108 "metadata": {
109 "collapsed": true
110 },
111 "outputs": [],
112 "source": [
113 "def show_grid(grid):\n",
114 " return lcat(cat(r) for r in grid)"
115 ]
116 },
117 {
118 "cell_type": "code",
119 "execution_count": 117,
120 "metadata": {
121 "collapsed": true
122 },
123 "outputs": [],
124 "source": [
125 "def indices(grid, r, c, l, d):\n",
126 " dr, dc = delta[d]\n",
127 " w = len(grid[0])\n",
128 " h = len(grid)\n",
129 " inds = [(r + i * dr, c + i * dc) for i in range(l)]\n",
130 " return [(i, j) for i, j in inds\n",
131 " if i >= 0\n",
132 " if j >= 0\n",
133 " if i < h\n",
134 " if j < w]"
135 ]
136 },
137 {
138 "cell_type": "code",
139 "execution_count": 118,
140 "metadata": {
141 "collapsed": true
142 },
143 "outputs": [],
144 "source": [
145 "def gslice(grid, r, c, l, d):\n",
146 " return [grid[i][j] for i, j in indices(grid, r, c, l, d)]"
147 ]
148 },
149 {
150 "cell_type": "code",
151 "execution_count": 119,
152 "metadata": {
153 "collapsed": true
154 },
155 "outputs": [],
156 "source": [
157 "def set_grid(grid, r, c, d, word):\n",
158 " for (i, j), l in zip(indices(grid, r, c, len(word), d), word):\n",
159 " grid[i][j] = l\n",
160 " return grid"
161 ]
162 },
163 {
164 "cell_type": "code",
165 "execution_count": 120,
166 "metadata": {
167 "collapsed": true
168 },
169 "outputs": [],
170 "source": [
171 "def present_many(grid, words):\n",
172 " w = len(grid[0])\n",
173 " h = len(grid)\n",
174 " wordlens = set(len(w) for w in words)\n",
175 " presences = []\n",
176 " for r in range(h):\n",
177 " for c in range(w):\n",
178 " for d in Direction:\n",
179 " for wordlen in wordlens:\n",
180 " word = cat(gslice(grid, r, c, wordlen, d))\n",
181 " if word in words:\n",
182 " presences += [(word, r, c, d)]\n",
183 " return set(presences)"
184 ]
185 },
186 {
187 "cell_type": "code",
188 "execution_count": 121,
189 "metadata": {
190 "collapsed": true
191 },
192 "outputs": [],
193 "source": [
194 "def read_wordsearch(fn):\n",
195 " lines = [l.strip() for l in open(fn).readlines()]\n",
196 " w, h = [int(s) for s in lines[0].split('x')]\n",
197 " grid = lines[1:h+1]\n",
198 " words = lines[h+1:]\n",
199 " return w, h, grid, words"
200 ]
201 },
202 {
203 "cell_type": "markdown",
204 "metadata": {},
205 "source": [
206 "# All wordsearch puzzles"
207 ]
208 },
209 {
210 "cell_type": "code",
211 "execution_count": 122,
212 "metadata": {
213 "collapsed": true
214 },
215 "outputs": [],
216 "source": [
217 "def read_all_wordsearch(fn):\n",
218 " with open(fn) as f:\n",
219 " text = f.read().strip()\n",
220 " puzzles_text = text.split('\\n\\n')\n",
221 " puzzles = []\n",
222 " for p in puzzles_text:\n",
223 " lines = p.splitlines()\n",
224 " w, h = [int(s) for s in lines[0].split('x')]\n",
225 " grid = lines[1:h+1]\n",
226 " words = lines[h+1:]\n",
227 " puzzles += [(w, h, grid, words)]\n",
228 " return puzzles"
229 ]
230 },
231 {
232 "cell_type": "markdown",
233 "metadata": {},
234 "source": [
235 "## Huge wordsearch"
236 ]
237 },
238 {
239 "cell_type": "code",
240 "execution_count": 123,
241 "metadata": {},
242 "outputs": [
243 {
244 "data": {
245 "text/plain": [
246 "(100, 100)"
247 ]
248 },
249 "execution_count": 123,
250 "metadata": {},
251 "output_type": "execute_result"
252 }
253 ],
254 "source": [
255 "puzzle = read_wordsearch('04-wordsearch.txt')\n",
256 "puzzle[:2]"
257 ]
258 },
259 {
260 "cell_type": "markdown",
261 "metadata": {},
262 "source": [
263 "## Part 1"
264 ]
265 },
266 {
267 "cell_type": "code",
268 "execution_count": 124,
269 "metadata": {
270 "collapsed": true
271 },
272 "outputs": [],
273 "source": [
274 "def found_words_length(puzzle):\n",
275 " width, height, grid, words = puzzle\n",
276 " return sum(len(p[0]) for p in present_many(grid, words))\n",
277 "\n",
278 "def total_found_words_length(puzzles):\n",
279 " return sum(found_words_length(p) for p in puzzles)"
280 ]
281 },
282 {
283 "cell_type": "code",
284 "execution_count": 125,
285 "metadata": {},
286 "outputs": [
287 {
288 "data": {
289 "text/plain": [
290 "8092"
291 ]
292 },
293 "execution_count": 125,
294 "metadata": {},
295 "output_type": "execute_result"
296 }
297 ],
298 "source": [
299 "found_words_length(puzzle)"
300 ]
301 },
302 {
303 "cell_type": "code",
304 "execution_count": 126,
305 "metadata": {},
306 "outputs": [
307 {
308 "name": "stdout",
309 "output_type": "stream",
310 "text": [
311 "1 loop, best of 3: 31.4 s per loop\n"
312 ]
313 }
314 ],
315 "source": [
316 "%%timeit\n",
317 "found_words_length(puzzle)"
318 ]
319 },
320 {
321 "cell_type": "code",
322 "execution_count": 127,
323 "metadata": {},
324 "outputs": [
325 {
326 "data": {
327 "text/plain": [
328 "1149"
329 ]
330 },
331 "execution_count": 127,
332 "metadata": {},
333 "output_type": "execute_result"
334 }
335 ],
336 "source": [
337 "width, height, grid, words = puzzle\n",
338 "len(present_many(grid, words))"
339 ]
340 },
341 {
342 "cell_type": "markdown",
343 "metadata": {},
344 "source": [
345 "## Part 2"
346 ]
347 },
348 {
349 "cell_type": "code",
350 "execution_count": 128,
351 "metadata": {
352 "collapsed": true
353 },
354 "outputs": [],
355 "source": [
356 "def max_unfound_word_length(puzzle):\n",
357 " width, height, grid, words = puzzle\n",
358 " presences = present_many(grid, words)\n",
359 " used_words = [p[0] for p in presences]\n",
360 " unused_words = [w for w in words if w not in used_words]\n",
361 " \n",
362 " unused_grid = [[c for c in r] for r in grid]\n",
363 " for w, r, c, d in presences:\n",
364 " set_grid(unused_grid, r, c, d, '.' * len(w))\n",
365 " unused_letters = [c for l in unused_grid for c in l if c != '.']\n",
366 " unused_letter_count = collections.Counter(unused_letters)\n",
367 " \n",
368 " makeable_words = []\n",
369 " for w in unused_words:\n",
370 " unused_word_count = collections.Counter(w)\n",
371 " if all(unused_word_count[l] <= unused_letter_count[l] for l in unused_word_count):\n",
372 " makeable_words += [w]\n",
373 " lwm = max(len(w) for w in makeable_words)\n",
374 " return lwm"
375 ]
376 },
377 {
378 "cell_type": "code",
379 "execution_count": 129,
380 "metadata": {
381 "collapsed": true
382 },
383 "outputs": [],
384 "source": [
385 "def unused_letters(puzzle):\n",
386 " width, height, grid, words = puzzle\n",
387 " presences = present_many(grid, words)\n",
388 " used_words = [p[0] for p in presences]\n",
389 " unused_words = [w for w in words if w not in used_words]\n",
390 " \n",
391 " unused_grid = [[c for c in r] for r in grid]\n",
392 " for w, r, c, d in presences:\n",
393 " set_grid(unused_grid, r, c, d, '.' * len(w))\n",
394 " unused_letters = [c for l in unused_grid for c in l if c != '.']\n",
395 " unused_letter_count = collections.Counter(unused_letters)\n",
396 " \n",
397 " return used_words, unused_letter_count"
398 ]
399 },
400 {
401 "cell_type": "code",
402 "execution_count": 130,
403 "metadata": {
404 "collapsed": true
405 },
406 "outputs": [],
407 "source": [
408 "def unused_vowels(puzzle):\n",
409 " width, height, grid, words = puzzle\n",
410 " presences = present_many(grid, words)\n",
411 " used_words = [p[0] for p in presences]\n",
412 " unused_words = [w for w in words if w not in used_words]\n",
413 " \n",
414 " unused_grid = [[c for c in r] for r in grid]\n",
415 " for w, r, c, d in presences:\n",
416 " set_grid(unused_grid, r, c, d, '.' * len(w))\n",
417 " unused_vowel_count = sum(1 for l in unused_grid for c in l if c in 'aeiou')\n",
418 " return unused_vowel_count"
419 ]
420 },
421 {
422 "cell_type": "code",
423 "execution_count": 131,
424 "metadata": {
425 "collapsed": true
426 },
427 "outputs": [],
428 "source": [
429 "def total_max_unfound_word_length(puzzles):\n",
430 " return sum(max_unfound_word_length(p) for p in puzzles)"
431 ]
432 },
433 {
434 "cell_type": "code",
435 "execution_count": 132,
436 "metadata": {},
437 "outputs": [
438 {
439 "data": {
440 "text/plain": [
441 "594"
442 ]
443 },
444 "execution_count": 132,
445 "metadata": {},
446 "output_type": "execute_result"
447 }
448 ],
449 "source": [
450 "unused_vowels(puzzle)"
451 ]
452 },
453 {
454 "cell_type": "code",
455 "execution_count": 133,
456 "metadata": {},
457 "outputs": [
458 {
459 "name": "stdout",
460 "output_type": "stream",
461 "text": [
462 "1 loop, best of 3: 31.3 s per loop\n"
463 ]
464 }
465 ],
466 "source": [
467 "%%timeit\n",
468 "unused_vowels(puzzle)"
469 ]
470 },
471 {
472 "cell_type": "code",
473 "execution_count": 134,
474 "metadata": {
475 "collapsed": true
476 },
477 "outputs": [],
478 "source": [
479 "# max_unfound_word_length(puzzle)"
480 ]
481 },
482 {
483 "cell_type": "code",
484 "execution_count": 135,
485 "metadata": {
486 "collapsed": true
487 },
488 "outputs": [],
489 "source": [
490 "# %%timeit\n",
491 "# max_unfound_word_length(puzzle)"
492 ]
493 },
494 {
495 "cell_type": "code",
496 "execution_count": 138,
497 "metadata": {},
498 "outputs": [
499 {
500 "data": {
501 "text/plain": [
502 "2217"
503 ]
504 },
505 "execution_count": 138,
506 "metadata": {},
507 "output_type": "execute_result"
508 }
509 ],
510 "source": [
511 "uw, unlc = unused_letters(puzzle)\n",
512 "sum(unlc[l] for l in unlc)"
513 ]
514 },
515 {
516 "cell_type": "code",
517 "execution_count": null,
518 "metadata": {
519 "collapsed": true
520 },
521 "outputs": [],
522 "source": []
523 }
524 ],
525 "metadata": {
526 "kernelspec": {
527 "display_name": "Python 3",
528 "language": "python",
529 "name": "python3"
530 },
531 "language_info": {
532 "codemirror_mode": {
533 "name": "ipython",
534 "version": 3
535 },
536 "file_extension": ".py",
537 "mimetype": "text/x-python",
538 "name": "python",
539 "nbconvert_exporter": "python",
540 "pygments_lexer": "ipython3",
541 "version": "3.5.2+"
542 }
543 },
544 "nbformat": 4,
545 "nbformat_minor": 1
546 }