From 94d275111f1186159fa592d86c47bfc9defbb536 Mon Sep 17 00:00:00 2001
From: Neil Smith
Date: Sun, 1 Jun 2014 12:25:24 +0100
Subject: [PATCH] Finished keyword breaking, started word segmentation

---
 slides/keyword-break.html     | 39 +++++++++++++++--
 slides/word-segmentation.html | 81 +++++++++++++++++++++++++++++++++++
 2 files changed, 116 insertions(+), 4 deletions(-)
 create mode 100644 slides/word-segmentation.html

diff --git a/slides/keyword-break.html b/slides/keyword-break.html
index b3c0a2c..46dded5 100644
--- a/slides/keyword-break.html
+++ b/slides/keyword-break.html
@@ -51,7 +51,7 @@ a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t |
 --|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|--
 k | e | y | w | o | r | d | a | b | c | f | g | h | i | j | l | m | n | p | q | s | t | u | v | x | z
 
-----
+---
 
 # Duplicate and extend your `affine_break()` function
 
@@ -85,11 +85,11 @@ But before we get there, a couple of diversions...
 
 ---
 
-# `map()`
+# map()
 
 A common task is to apply a function to each item in a sequence, returning a sequence of the results.
 
-```python```
+```python
 def double(x):
     return x * 2
 
@@ -107,11 +107,13 @@ How can we use this for keyword cipher breaking?
 
 Define a function that takes a possible key (keyword and cipher type) and returns the key and its fitness.
 
+* (Also pass in the message and the fitness function)
+
 Use `map()` and `max()` to find the best key
 
 ---
 
-# `print()`
+# Arity of print()
 
 How many arguments does this take?
 
@@ -143,6 +145,35 @@ First number goes in `x`, remaining go in the tuple `xs`
 
 What does `Pool.starmap()` do?
 
+---
+
+```python
+from multiprocessing import Pool
+
+def keyword_break_mp(message, wordlist=keywords, fitness=Pletters, chunksize=500):
+    helper_args = [??? for word in wordlist] # One tuple for each possible key
+    with Pool() as pool:
+        breaks = pool.starmap(keyword_break_worker, helper_args, chunksize)
+    return max(breaks, key=lambda k: k[1])
+
+def keyword_break_worker(???):
+    ???
+    return (key, fitness)
+```
+
+* Gotcha: the function in `Pool.starmap()` must be defined at the top level
+    * This is definitely a "feature"
+
+---
+
+# Performance and chunksize
+
+Try the multiprocessing keyword break. Is it using all the resources?
+
+Setting `chunksize` is an art.
+
+## Map-reduce as a general pattern for multiprocessing
+
diff --git a/slides/word-segmentation.html b/slides/word-segmentation.html
new file mode 100644
index 0000000..6eb88e3
--- /dev/null
+++ b/slides/word-segmentation.html
@@ -0,0 +1,81 @@
+
+
+
+ Affine ciphers
+
+
+
+
+
+
+
+
+
+
+
+
-- 
2.34.1