Removing files from data analysis directory
[ou-summer-of-code-2017.git] / 07-interpreter / interpreter-solution.ipynb
1 {
2 "cells": [
3 {
4 "cell_type": "markdown",
5 "metadata": {},
6 "source": [
7 "# Machine interpreter\n",
8 "\n",
9 "## Instructions\n",
10 "\n",
11 "After a good day sightseeing, it's back to the hotel and time for some refreshment. Unfortunately, the minibar in your room refuses to open. It's been hacked, with some ransomware! You'll need to provide the correct unlock code so you can get a nice, cold drink.\n",
12 "\n",
13 "You could pay a large chunk of bitcoin, or you could defeat the ransomware some other way. \n",
14 "\n",
15 "You quickly find the schematics of the minibar lock. It's a fairly simple machine. It has four registers, `a`, `b`, `c`, `d`, and a special purpose `pc` register for the program counter. Each register can hold a 64-bit value (far bigger than any number in the programs you'll be running). You can assume all registers hold zero when the program starts.\n",
16 "\n",
17 "On each clock tick, the machine executes the instruction pointed to by the `pc`, then increments `pc`. The machine halts when the machine tries to read from a location beyond the end of the program.\n",
18 "\n",
19 "The machine has only a few instructions. They're listed by handy mnemonics:\n",
20 "\n",
21 "| Instruction | Description |\n",
22 "|:------------|:------------|\n",
23 "| `inc r` | increment contents of register `r` |\n",
24 "| `dec r` | decrement contents of register `r` |\n",
25 "| `set r i` | set contents of register `r` to literal value `i` |\n",
26 "| `cpy r s` | copy contents of register `r` into register `s` | \n",
27 "| `jmp i` | jump to instruction `i` places forward |\n",
28 "| `jpz r i` | jump to instruction `i` places forward if<br>register `r` contains zero, otherwise continue to next instruction |\n",
29 "\n",
30 "The `jmp` and `jpz` instructions jump relative to the current instruction, overriding the normal change in `pc`. `jmp -1` would jump back to the previous instruction; `jpz a 2` would skip the next instruction if register `a` contains zero.\n",
31 "\n",
32 "Before you start execution of a program, you can set the values of some registers.\n",
33 "\n",
34 "For example, this program multiplies the values in the a and b registers, leaving the result in the c register:\n",
35 "\n",
36 "```\n",
37 "set c 0\n",
38 "cpy a d\n",
39 "jpz b 8\n",
40 "dec b\n",
41 "cpy d a\n",
42 "jpz a 4\n",
43 "inc c\n",
44 "dec a\n",
45 "jmp -3\n",
46 "jmp -7\n",
47 "set d 0\n",
48 "```\n",
49 "\n",
50 "# Part 1\n",
51 "\n",
52 "You think you've worked out how to generate the code wanted by the ransomware. The program is given in [07-program](07-program.txt), one instruction per line. \n",
53 "\n",
54 "Starting with register `a` holding 7, and all other registers holding zero, what does register `a` contain when the program finishes?"
55 ]
56 },
57 {
58 "cell_type": "markdown",
59 "metadata": {},
60 "source": [
61 "It seems your guess of 7 as the starting value was wrong.\n",
62 "\n",
63 "# Part 2\n",
64 "\n",
65 "The program is still given in [07-program.txt](07-program.txt), one instruction per line. \n",
66 "\n",
67 "Starting with register `a` holding 937, and all other registers and memory locations holding zero, what does register `a` contain when the program finishes?"
68 ]
69 },
70 {
71 "cell_type": "markdown",
72 "metadata": {},
73 "source": [
74 "# Worked example solution: parts 1 and 2\n",
75 "This follows a very similar idea to [Day 5](../05-display-board/display-board-solution.ipynb): we're building a virtual machine and applying a set of instructions to it.\n",
76 "\n",
77 "The virtual machine is, if anything, simpler than the laser display board from day 5. There are just four registers, the special `pc` register, and the list of instructions. \n",
78 "\n",
79 "Again, as with the laser display board, each operation is defined as Python function, which takes as arguments the instruction's arguments and the machine being operated on. The operations update the machine in-place.\n",
80 "\n",
81 "One decision is over what counts as an 'instruction'. I've taken the view that what's stored in the machine should be as close to what's exectuted as possible. Therefore, each instruction is a pair (2-tuple) of the function that performs that instruction. The first element is the Python function that performs the operation; the second is a list of arguments, either register names or literal number values. See the `run()` procedure for how it's used.\n",
82 "\n",
83 "## Parsing instructions\n",
84 "Another change from the laser display board is how the instructions are parsed. In the same way as before, I use a dispatch table (called `instruction_table`) to relate the instruction names to the Python procedures that perform them. But I also have a the `numeric_args_table` which specifies which arguments to which instructions should be processed as numbers. The `parse` procedure reads both of those and converts the instruction text into the correct Python representation."
85 ]
86 },
87 {
88 "cell_type": "code",
89 "execution_count": 3,
90 "metadata": {
91 "collapsed": true
92 },
93 "outputs": [],
94 "source": [
95 "def new_machine():\n",
96 " return {'pc': 0, \n",
97 " 'a': 0,\n",
98 " 'b': 0, \n",
99 " 'c': 0,\n",
100 " 'd': 0,\n",
101 " 'instructions': []}"
102 ]
103 },
104 {
105 "cell_type": "code",
106 "execution_count": 4,
107 "metadata": {
108 "collapsed": true
109 },
110 "outputs": [],
111 "source": [
112 "def show_machine(machine):\n",
113 " return ', '.join('{}: {}'.format(sk, machine[int(sk) if sk.isnumeric() else sk]) \n",
114 " for sk in sorted(str(k) for k in machine)\n",
115 " if sk != 'instructions')"
116 ]
117 },
118 {
119 "cell_type": "code",
120 "execution_count": 5,
121 "metadata": {
122 "collapsed": true
123 },
124 "outputs": [],
125 "source": [
126 "def inc(reg, machine):\n",
127 " machine[reg] += 1\n",
128 " machine['pc'] += 1"
129 ]
130 },
131 {
132 "cell_type": "code",
133 "execution_count": 6,
134 "metadata": {
135 "collapsed": true
136 },
137 "outputs": [],
138 "source": [
139 "def dec(reg, machine):\n",
140 " machine[reg] -= 1\n",
141 " machine['pc'] += 1"
142 ]
143 },
144 {
145 "cell_type": "code",
146 "execution_count": 7,
147 "metadata": {
148 "collapsed": true
149 },
150 "outputs": [],
151 "source": [
152 "def jmp(addr, machine):\n",
153 " machine['pc'] += addr"
154 ]
155 },
156 {
157 "cell_type": "code",
158 "execution_count": 8,
159 "metadata": {
160 "collapsed": true
161 },
162 "outputs": [],
163 "source": [
164 "def jpz(reg, addr, machine):\n",
165 " if machine[reg] == 0:\n",
166 " machine['pc'] += addr\n",
167 " else:\n",
168 " machine['pc'] += 1"
169 ]
170 },
171 {
172 "cell_type": "code",
173 "execution_count": 9,
174 "metadata": {
175 "collapsed": true
176 },
177 "outputs": [],
178 "source": [
179 "def set_literal(reg, literal, machine):\n",
180 " machine[reg] = literal\n",
181 " machine['pc'] += 1"
182 ]
183 },
184 {
185 "cell_type": "code",
186 "execution_count": 10,
187 "metadata": {
188 "collapsed": true
189 },
190 "outputs": [],
191 "source": [
192 "def cpy(from_reg, to_reg, machine):\n",
193 " machine[to_reg] = machine[from_reg]\n",
194 " machine['pc'] += 1"
195 ]
196 },
197 {
198 "cell_type": "code",
199 "execution_count": 11,
200 "metadata": {
201 "collapsed": true
202 },
203 "outputs": [],
204 "source": [
205 "instruction_table = {'inc': inc, 'dec': dec, 'jmp': jmp,\n",
206 " 'jpz': jpz, 'set': set_literal, 'cpy': cpy}\n",
207 "numeric_args_table = {'jmp': [0], 'jpz': [1], 'set': [1], 'sto': [1], 'ld': [1]}"
208 ]
209 },
210 {
211 "cell_type": "code",
212 "execution_count": 12,
213 "metadata": {
214 "collapsed": true
215 },
216 "outputs": [],
217 "source": [
218 "def parse(instruction):\n",
219 " words = instruction.split()\n",
220 " instr = words[0]\n",
221 " args = words[1:]\n",
222 " if instr in numeric_args_table:\n",
223 " for p in numeric_args_table[instr]:\n",
224 " args[p] = int(args[p])\n",
225 " return instruction_table[instr], args"
226 ]
227 },
228 {
229 "cell_type": "code",
230 "execution_count": 13,
231 "metadata": {
232 "collapsed": true
233 },
234 "outputs": [],
235 "source": [
236 "def program_from_instructions(prog, machine):\n",
237 " machine['instructions'] = [parse(instr) for instr in prog]"
238 ]
239 },
240 {
241 "cell_type": "markdown",
242 "metadata": {},
243 "source": [
244 "Note that I've added an extra step in the program parsing stage, which is dealing with labels. When you're writing programs like this, it's very easy to lose track of where each instruction is and how large the jumps should be. To get around that, I take a leaf from standard assembly programming of allowing user-defined labels for positions. That allows me to write the example multiplication program like this:\n",
245 "\n",
246 "```\n",
247 " set c 0\n",
248 " cpy a d\n",
249 "loop: jpz b end \n",
250 " dec b\n",
251 " cpy d a\n",
252 "smul: jpz a emul\n",
253 " inc c\n",
254 " dec a\n",
255 " jmp smul\n",
256 "emul: jmp loop \n",
257 " \n",
258 "end: set d 0\n",
259 "```\n",
260 "\n",
261 "where the words at the start of the line are the labels and all the jumps move to particular labels. This is a bit more readable, and also means I can add and remove instructions without having to change all the jump instructions as well.\n",
262 "\n",
263 "The `replace_labels` function makes two passes through the program listing. The first pass finds all the labels and stores their absolute positions. The second pass replaces each label in an instruction with the relative position from the instruction number to the label's position."
264 ]
265 },
266 {
267 "cell_type": "code",
268 "execution_count": 14,
269 "metadata": {
270 "collapsed": true
271 },
272 "outputs": [],
273 "source": [
274 "def unlabel_listing(listing):\n",
275 " labelled_instructions = [i.strip() for i in listing.split('\\n') \n",
276 " if i.strip() \n",
277 " if not i.strip().startswith('#')]\n",
278 " return replace_labels(labelled_instructions) "
279 ]
280 },
281 {
282 "cell_type": "code",
283 "execution_count": 15,
284 "metadata": {
285 "collapsed": true
286 },
287 "outputs": [],
288 "source": [
289 "def program_from_listing(listing, machine):\n",
290 " instructions = unlabel_listing(listing)\n",
291 " program_from_instructions(instructions, machine)"
292 ]
293 },
294 {
295 "cell_type": "code",
296 "execution_count": 16,
297 "metadata": {
298 "collapsed": true
299 },
300 "outputs": [],
301 "source": [
302 "def replace_labels(listing):\n",
303 " locations = {}\n",
304 " for n, i in enumerate(listing):\n",
305 " if ':' in i:\n",
306 " locations[i.split(':')[0]] = n\n",
307 "\n",
308 " unlabelled_listing = []\n",
309 " for n, i in enumerate(listing):\n",
310 " instr = i.split()\n",
311 " if ':' in i:\n",
312 " instr = i.split(':')[1].split()\n",
313 " else:\n",
314 " instr = i.split()\n",
315 " terms = []\n",
316 " for term in instr:\n",
317 " if term in locations:\n",
318 " terms += [str(locations[term] - n)]\n",
319 " else:\n",
320 " terms += [term]\n",
321 " transformed_instr = ' '.join(terms)\n",
322 " unlabelled_listing += [transformed_instr]\n",
323 " \n",
324 " return unlabelled_listing "
325 ]
326 },
327 {
328 "cell_type": "markdown",
329 "metadata": {},
330 "source": [
331 "Note the 'splat operator' trick, which unpacks a list of items into positional parameters of the function. It has the effect of converting\n",
332 "```\n",
333 "cpy(['a', 'd'], machine)\n",
334 "```\n",
335 "into\n",
336 "```\n",
337 "cpy('a', 'd', machine)\n",
338 "```\n",
339 "which is the form I want. \n",
340 "\n",
341 "Another way of doing this could be to have the individual instruction-implementing procedures take a list of parameters, but I think this is clearer to understand."
342 ]
343 },
344 {
345 "cell_type": "code",
346 "execution_count": 17,
347 "metadata": {
348 "collapsed": true
349 },
350 "outputs": [],
351 "source": [
352 "def run(machine, initial_state=None, trace=False):\n",
353 " if initial_state:\n",
354 " machine.update(initial_state)\n",
355 " while machine['pc'] < len(machine['instructions']):\n",
356 " if trace:\n",
357 " print(show_machine(machine))\n",
358 " cmd, args = machine['instructions'][machine['pc']]\n",
359 " cmd(*args, machine)"
360 ]
361 },
362 {
363 "cell_type": "code",
364 "execution_count": 18,
365 "metadata": {
366 "collapsed": true
367 },
368 "outputs": [],
369 "source": [
370 "def execute(listing, initial_state=None, trace=False):\n",
371 " m = new_machine()\n",
372 " program_from_listing(listing, m)\n",
373 " run(m, initial_state=initial_state, trace=trace)\n",
374 " return m"
375 ]
376 },
377 {
378 "cell_type": "code",
379 "execution_count": 22,
380 "metadata": {},
381 "outputs": [
382 {
383 "data": {
384 "text/plain": [
385 "'a: 52, b: 0, c: 0, d: 0, pc: 48'"
386 ]
387 },
388 "execution_count": 22,
389 "metadata": {},
390 "output_type": "execute_result"
391 }
392 ],
393 "source": [
394 "program = open('07-program.txt').read()\n",
395 "show_machine(execute(program, initial_state={'a': 7}))"
396 ]
397 },
398 {
399 "cell_type": "code",
400 "execution_count": 23,
401 "metadata": {},
402 "outputs": [
403 {
404 "data": {
405 "text/plain": [
406 "'a: 250504, b: 0, c: 0, d: 0, pc: 48'"
407 ]
408 },
409 "execution_count": 23,
410 "metadata": {},
411 "output_type": "execute_result"
412 }
413 ],
414 "source": [
415 "program = open('07-program.txt').read()\n",
416 "show_machine(execute(program, initial_state={'a': 937}))"
417 ]
418 },
419 {
420 "cell_type": "code",
421 "execution_count": null,
422 "metadata": {
423 "collapsed": true
424 },
425 "outputs": [],
426 "source": []
427 }
428 ],
429 "metadata": {
430 "kernelspec": {
431 "display_name": "Python 3",
432 "language": "python",
433 "name": "python3"
434 },
435 "language_info": {
436 "codemirror_mode": {
437 "name": "ipython",
438 "version": 3
439 },
440 "file_extension": ".py",
441 "mimetype": "text/x-python",
442 "name": "python",
443 "nbconvert_exporter": "python",
444 "pygments_lexer": "ipython3",
445 "version": "3.5.3"
446 }
447 },
448 "nbformat": 4,
449 "nbformat_minor": 2
450 }