From 94d275111f1186159fa592d86c47bfc9defbb536 Mon Sep 17 00:00:00 2001
From: Neil Smith <neil.git@njae.me.uk>
Date: Sun, 1 Jun 2014 12:25:24 +0100
Subject: [PATCH] Finished keyword breaking, started word segmentation

---
 slides/keyword-break.html     | 39 +++++++++++++++--
 slides/word-segmentation.html | 81 +++++++++++++++++++++++++++++++++++
 2 files changed, 116 insertions(+), 4 deletions(-)
 create mode 100644 slides/word-segmentation.html

diff --git a/slides/keyword-break.html b/slides/keyword-break.html
index b3c0a2c..46dded5 100644
--- a/slides/keyword-break.html
+++ b/slides/keyword-break.html
@@ -51,7 +51,7 @@ a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t |
 --|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|--
 k | e | y | w | o | r | d | a | b | c | f | g | h | i | j | l | m | n | p | q | s | t | u | v | x | z
 
-----
+---
 
 # Duplicate and extend your `affine_break()` function
 
@@ -85,11 +85,11 @@ But before we get there, a couple of diversions...
 
 ---
 
-# `map()`
+# map()
 
 A common task is to apply a function to each item in a sequence, returning a sequence of the results.
 
-```python```
+```python
 def double(x):
     return x * 2
 
@@ -107,11 +107,13 @@ How can we use this for keyword cipher breaking?
 
 Define a function that takes a possible key (keyword and cipher type) and returns the key and its fitness.
 
+* (Also pass in the message and the fitness function)
+
 Use `map()` and `max()` to find the best key
 
 ---
 
-# `print()`
+# Arity of print()
 
 How many arguments does this take?
 
@@ -143,6 +145,35 @@ First number goes in `x`, remaining go in the tuple `xs`
 
 What does `Pool.starmap()` do?
 
+---
+```python
+from multiprocessing import Pool
+
+def keyword_break_mp(message, wordlist=keywords, fitness=Pletters, chunksize=500):
+    helper_args = [???
+                   for word in wordlist] # One tuple for each possible key
+    with Pool() as pool:
+        breaks = pool.starmap(keyword_break_worker, helper_args, chunksize)
+    return max(breaks, key=lambda k: k[1])
+
+def keyword_break_worker(???):
+    ???
+    return (key, fitness)
+```
+
+* Gotcha: the function in `Pool.starmap()` must be defined at the top level
+  * This is definitely a "feature"
+
+---
+
+# Performance and chunksize
+
+Try the multiprocessing keyword break. Is it using all the resources?
+
+Setting `chunksize` is an art.
+
+## Map-reduce as a general pattern for multiprocessing
+
     </textarea>
     <script src="http://gnab.github.io/remark/downloads/remark-0.6.0.min.js" type="text/javascript">
     </script>
diff --git a/slides/word-segmentation.html b/slides/word-segmentation.html
new file mode 100644
index 0000000..6eb88e3
--- /dev/null
+++ b/slides/word-segmentation.html
@@ -0,0 +1,81 @@
+<!DOCTYPE html>
+<html>
+  <head>
+    <title>Affine ciphers</title>
+    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
+    <style type="text/css">
+      /* Slideshow styles */
+      body {
+        font-size: 20px;
+      }
+      h1, h2, h3 {
+        font-weight: 400;
+        margin-bottom: 0;
+      }
+      h1 { font-size: 3em; }
+      h2 { font-size: 2em; }
+      h3 { font-size: 1.6em; }
+      a, a > code {
+        text-decoration: none;
+      }
+      code {
+        -moz-border-radius: 5px;
+        -web-border-radius: 5px;
+        background: #e7e8e2;
+        border-radius: 5px;
+        font-size: 16px;
+      }
+      .plaintext {
+        background: #272822;
+        color: #80ff80;
+        text-shadow: 0 0 20px #333;
+        padding: 2px 5px;
+      }
+      .ciphertext {
+        background: #272822;
+        color: #ff6666;
+        text-shadow: 0 0 20px #333;
+        padding: 2px 5px;
+      }
+      .float-right {
+        float: right;
+      }
+    </style>
+  </head>
+  <body>
+    <textarea id="source">
+
+# Word segmentation
+
+a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
+--|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|--
+k | e | y | w | o | r | d | a | b | c | f | g | h | i | j | l | m | n | p | q | s | t | u | v | x | z
+
+----
+
+
+    </textarea>
+    <script src="http://gnab.github.io/remark/downloads/remark-0.6.0.min.js" type="text/javascript">
+    </script>
+
+    <script type="text/javascript"
+      src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&delayStartupUntil=configured"></script>
+
+    <script type="text/javascript">
+      var slideshow = remark.create({ ratio: "16:9" });
+
+      // Setup MathJax
+      MathJax.Hub.Config({
+        tex2jax: {
+        skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
+        }
+      });
+      MathJax.Hub.Queue(function() {
+        $(MathJax.Hub.getAllJax()).map(function(index, elem) {
+          return(elem.SourceElement());
+        }).parent().addClass('has-jax');
+      });
+      MathJax.Hub.Configured();
+    </script>
+  </body>
+</html>
-- 
2.43.0