slides/keyword-break.html

   1 <!DOCTYPE html>
   2 <html>
   3   <head>
   4     <title>Breaking keyword ciphers</title>
   5     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
   6     <style type="text/css">
   7       /* Slideshow styles */
   8       body {
   9         font-size: 20px;
  10       }
  11       h1, h2, h3 {
  12         font-weight: 400;
  13         margin-bottom: 0;
  14       }
  15       h1 { font-size: 3em; }
  16       h2 { font-size: 2em; }
  17       h3 { font-size: 1.6em; }
  18       a, a > code {
  19         text-decoration: none;
  20       }
  21       code {
  22         -moz-border-radius: 5px;
  23         -webkit-border-radius: 5px;
  24         background: #e7e8e2;
  25         border-radius: 5px;
  26         font-size: 16px;
  27       }
  28       .plaintext {
  29         background: #272822;
  30         color: #80ff80;
  31         text-shadow: 0 0 20px #333;
  32         padding: 2px 5px;
  33       }
  34       .ciphertext {
  35         background: #272822;
  36         color: #ff6666;
  37         text-shadow: 0 0 20px #333;
  38         padding: 2px 5px;
  39       }
  40       .indexlink {
  41         position: absolute;
  42         bottom: 1em;
  43         left: 1em;
  44       }
  45       .float-right {
  46         float: right;
  47       }
  48     </style>
  49   </head>
  50   <body>
  51     <textarea id="source">
  52
  53 # Breaking keyword ciphers
  54
  55 a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
  56 --|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|--
  57 k | e | y | w | o | r | d | a | b | c | f | g | h | i | j | l | m | n | p | q | s | t | u | v | x | z
  58
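A minimal sketch of how the table above could be built, assuming the cipher alphabet is the keyword's letters (first occurrence only) followed by the unused letters starting from `a`. The names `keyword_cipher_alphabet` and `keyword_encipher_sketch` are made up for this illustration.

```python
import string

def keyword_cipher_alphabet(keyword):
    """Keyword letters (first occurrence only), then the unused letters from 'a'."""
    seen = []
    for letter in keyword.lower() + string.ascii_lowercase:
        if letter in string.ascii_lowercase and letter not in seen:
            seen.append(letter)
    return ''.join(seen)

def keyword_encipher_sketch(message, keyword):
    """Substitute each plaintext letter using the keyword cipher alphabet."""
    table = str.maketrans(string.ascii_lowercase, keyword_cipher_alphabet(keyword))
    return message.lower().translate(table)
```

With the keyword `keyword`, `keyword_cipher_alphabet('keyword')` gives the bottom row of the table.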
  59 ---
  60
  61 layout: true
  62
  63 .indexlink[[Index](index.html)]
  64
  65 ---
  66
  67 # Duplicate and extend your `affine_break()` function
  68
  69 * How to cycle through all the keys? What _are_ all the keys?
  70
  71 * Look at `words.txt`
  72
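One possible way to enumerate the keys, assuming a key is a (word, wrap mode) pair and that the word list has one word per line. `candidate_keys` is a hypothetical helper, and the wrap modes are whatever settings your own cipher supports.

```python
from itertools import product

def candidate_keys(word_lines, wrap_modes):
    """Pair every cleaned-up word with every wrap mode (blank lines are skipped)."""
    words = [line.strip().lower() for line in word_lines if line.strip()]
    return list(product(words, wrap_modes))
```

An open file can be passed straight in as `word_lines`, for example the handle from `open('words.txt')`.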
  73 ---
  74
  75 # Test it.
  76
  77 * `2013/4a.ciphertext`
  78 * `2013/4b.ciphertext`
  79
  80 This will take a while. Fire up a system monitor. What's wrong?
  81
  82 ---
  83
  84 # Python, threads, and the GIL
  85
  86 Thread-safe shared-memory code is hard.
  87
  88 Python's Global Interpreter Lock (GIL) stops you shooting yourself in the foot: only one thread runs Python bytecode at a time.
  89
  90 Where you want true parallelism, you need separate Python processes, each with its own interpreter and GIL.
  91
  94 The `multiprocessing` library makes this easier.
  95
  96 But before we get there, a couple of diversions...
  97
  98 ---
  99
 100 # DRYing code
 101
 102 Three cipher breaking tasks so far.
 103
 104 All working on the same principle:
 105
 106 ```
 107 find a way to enumerate all the possible keys
 108 initialise 'best so far'
 109 for each key:
 110     decipher message with this key
 111     score it
 112     if it's better than the best so far:
 113         update best so far
 114 ```
 115
 116 Repetition of code is a bad smell.
 117
 118 Separate out
 119
 120 * enumerate the keys
 121 * score a key
 122 * find the key with the best score
 123
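A sketch of the three pieces pulled apart, using a Caesar cipher as the worked case. `best_key` and `count_common_letters` are names invented here, and the fitness function is a toy stand-in, not the scorer used elsewhere in the course.

```python
import string

def caesar_decipher(message, shift):
    """Move each lowercase letter back `shift` places; leave other characters alone."""
    alphabet = string.ascii_lowercase
    return ''.join(
        alphabet[(alphabet.index(c) - shift) % 26] if c in alphabet else c
        for c in message)

def count_common_letters(text):
    """Toy fitness (an assumption): how many characters are e, t, a, o, i or n."""
    return sum(1 for c in text if c in 'etaoin')

def best_key(message, keys, decipher, fitness):
    """Score the decipherment under every key; return the best (key, score) pair."""
    scored = ((key, fitness(decipher(message, key))) for key in keys)
    return max(scored, key=lambda pair: pair[1])
```

A Caesar break then becomes `best_key(ciphertext, range(26), caesar_decipher, count_common_letters)`; only the key enumeration and the decipher function change for other ciphers.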
 124 ---
 125
 126 # map()
 127
 128 A common task is to apply a function to each item in a sequence, returning a sequence of the results.
 129
 130 ```python
 131 >>> def double(x):
 132 ...     return x * 2
 133 ...
 134 >>> list(map(double, [1, 2, 3]))
 135 [2, 4, 6]
 136 ```
 137
 138 * `map()` is a higher-order function: its first argument is the function that's applied.
 139
 140 How can we use this for keyword cipher breaking?
 141
 142 ---
 143
 144 # Mapping keyword decipherings
 145
 146 Define a function that takes a possible key (keyword and cipher type) and returns the key and its fitness.
 147
 148 * (Also pass in the message and the fitness function)
 149
 150 Use `map()` and `max()` to find the best key
 151
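One way this might look, assuming the decipher and fitness functions are passed in as arguments. `score_key` and `best_key_with_map` are hypothetical names, not the course's own.

```python
def score_key(key, message, decipher, fitness):
    """Return the key together with the fitness of the text it produces."""
    return key, fitness(decipher(message, key))

def best_key_with_map(message, keys, decipher, fitness):
    """Map the scorer over every key, then take the pair with the highest fitness."""
    scored = map(lambda key: score_key(key, message, decipher, fitness), keys)
    return max(scored, key=lambda pair: pair[1])
```

`max()` consumes the lazy `map` object directly, so no intermediate list is built.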
 152 ---
 153
 154 # Arity of print()
 155
 156 How many arguments does this take?
 157
 158 How do you write a function that takes this many arguments?
 159
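A quick experiment to try: `print()` accepts any number of positional arguments, plus keyword arguments such as `sep` and `file`. The helper `capture_prints` is made up for this sketch and writes to an in-memory buffer so the result can be inspected.

```python
import io

def capture_prints():
    """Call print() with zero, one and three positional arguments."""
    buffer = io.StringIO()
    print(file=buffer)                          # no positional arguments: just a newline
    print('a', file=buffer)                     # one positional argument
    print('a', 'b', 'c', sep='-', file=buffer)  # three, joined by sep
    return buffer.getvalue()
```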
 160 ---
 161
 162 # Function arguments
 163
 164 ## Positional, keyword
 165
 166 * Common or garden parameters, as you're used to.
 167 * `def keyword_encipher(message, keyword, wrap_alphabet=Keyword_wrap_alphabet.from_a):`
 168
 169 ## Excess positional
 170 * `def mean(x, *xs):`
 171
 172 The first number goes in `x`; any remaining ones go in the tuple `xs`.
 173
 174 ## Excess keyword
 175
 176 * `def myfunc(arg1=0, **kwargs):`
 177
 178 `kwargs` will be a dict of the remaining keyword arguments (everything except `arg1`).
 179
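A small sketch of both signatures in action. The function bodies are assumptions added for illustration; only the signatures come from the slide.

```python
def mean(x, *xs):
    """Average of one or more numbers; extra positionals arrive in the tuple xs."""
    return (x + sum(xs)) / (1 + len(xs))

def myfunc(arg1=0, **kwargs):
    """Return what landed where, so the split between arg1 and kwargs is visible."""
    return arg1, kwargs
```

Calling `mean(1, 2, 3)` puts `1` in `x` and `(2, 3)` in `xs`; calling `myfunc(arg1=1, colour='red')` leaves only `colour` in `kwargs`.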
 180 ---
 181
 182 # Back to `multiprocessing`
 183
 184 What does `Pool.starmap()` do?
 185
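To get a feel for it without any processes involved, here is the single-process cousin, `itertools.starmap()`: each item in the iterable is a tuple that gets unpacked into the function's arguments. `Pool.starmap()` does the same unpacking but spreads the calls across worker processes and returns a list. `power` is just a stand-in function.

```python
from itertools import starmap

def power(base, exponent):
    """A stand-in two-argument function."""
    return base ** exponent

pairs = [(2, 3), (3, 2), (10, 0)]
results = list(starmap(power, pairs))       # [8, 9, 1]
by_hand = [power(*pair) for pair in pairs]  # the same calls, written out
```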
 186 ---
 187
 188 ```python
 189 from multiprocessing import Pool
 190
 191 def keyword_break_mp(message, wordlist=keywords, fitness=Pletters, chunksize=500):
 192     helper_args = [??? for word in wordlist] # One tuple for each possible key
 193     with Pool() as pool:
 194         breaks = pool.starmap(keyword_break_worker, helper_args, chunksize)
 195         return max(breaks, key=lambda k: k[1])
 196
 197 def keyword_break_worker(???):
 198     ???
 199     return (key, fitness)
 200 ```
 201
 202 * Gotcha: the function passed to `Pool.starmap()` must be defined at the top level of a module, so that it can be pickled and sent to the worker processes
 203     * This is definitely a "feature"
 204
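A sketch of why the gotcha exists: `multiprocessing` pickles the function to hand it to the workers, and pickle stores functions by name. `add` and `can_pickle` are hypothetical names for this demonstration.

```python
import pickle

def add(x, y):
    """Defined at module top level, so pickle can refer to it by name."""
    return x + y

def can_pickle(obj):
    """True if pickle.dumps() accepts obj, False if it refuses."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
```

Run as a script, `can_pickle(add)` should be true, while `can_pickle(lambda x, y: x + y)` is false because a lambda has no importable name.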
 205 ---
 206
 207 # Performance and chunksize
 208
 209 Try the multiprocessing keyword break. Is it using all the resources?
 210
 211 Setting `chunksize` is an art.
 212
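One way to explore the art is to time the same job at several chunk sizes. This is a rough sketch: `busy_work` and `time_chunksizes` are invented names, the workload is artificial, and the timings will vary from machine to machine.

```python
import time
from multiprocessing import Pool

def busy_work(n, repeats):
    """Artificial CPU-bound task standing in for one decipher-and-score call."""
    total = 0
    for _ in range(repeats):
        total += n * n
    return total

def time_chunksizes(chunksizes, n_tasks=2000, repeats=200):
    """Return {chunksize: seconds taken} for the same batch of tasks."""
    args = [(n, repeats) for n in range(n_tasks)]
    timings = {}
    with Pool() as pool:
        for chunksize in chunksizes:
            start = time.perf_counter()
            pool.starmap(busy_work, args, chunksize)
            timings[chunksize] = time.perf_counter() - start
    return timings
```

Call `time_chunksizes([1, 10, 100, 1000])` from under an `if __name__ == '__main__':` guard. Very small chunks spend more time on communication between processes; very large ones can leave some workers idle.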
 213 ## Map-reduce as a general pattern for multiprocessing
 214
 215     </textarea>
 216     <script src="http://gnab.github.io/remark/downloads/remark-0.6.0.min.js" type="text/javascript">
 217     </script>
 218
 219     <script type="text/javascript"
 220       src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&delayStartupUntil=configured"></script>
 221
 222     <script type="text/javascript">
 223       var slideshow = remark.create({ ratio: "16:9" });
 224
 225       // Setup MathJax
 226       MathJax.Hub.Config({
 227         tex2jax: {
 228         skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
 229         }
 230       });
 231       MathJax.Hub.Queue(function() {
 232         $(MathJax.Hub.getAllJax()).map(function(index, elem) {
 233             return(elem.SourceElement());
 234         }).parent().addClass('has-jax');
 235       });
 236       MathJax.Hub.Configured();
 237     </script>
 238   </body>
 239 </html>