Included frequency histogram diagrams
author Neil Smith <neil.git@njae.me.uk>
Wed, 12 Mar 2014 12:32:08 +0000 (12:32 +0000)
committer Neil Smith <neil.git@njae.me.uk>
Wed, 12 Mar 2014 12:32:08 +0000 (12:32 +0000)
slides/c1a_frequency_histogram.png [new file with mode: 0644]
slides/caesar-break.html
slides/english_frequency_histogram.png [new file with mode: 0644]

diff --git a/slides/c1a_frequency_histogram.png b/slides/c1a_frequency_histogram.png
new file mode 100644
index 0000000..7ccbd42
Binary files /dev/null and b/slides/c1a_frequency_histogram.png differ
diff --git a/slides/caesar-break.html b/slides/caesar-break.html
index d552209419fe6f6e7074499ea7998f4c55a7465c..6df8365ddfe5c8b4f21e812068a1356adb557869 100644
 
 ---
 
-# Brute force
+# Human vs Machine
 
-How many keys to try?
+Slow but clever vs Dumb but fast
+
+## Human approach
+
+Ciphertext | Plaintext 
+---|---
+![left-aligned Ciphertext frequencies](c1a_frequency_histogram.png) | ![left-aligned English frequencies](english_frequency_histogram.png) 
+
+---
+
+# Human vs Machine
+
+## Machine approach
+
+Brute force. 
+
+Try all keys.
+
+* How many keys to try?
 
 ## Basic idea
 
@@ -67,6 +85,7 @@ What steps do we know how to do?
 # How close is it to English?
 
 What does English look like?
+
 * We need a model of English.
 
 How do we define "closeness"?
@@ -104,19 +123,19 @@ The distance between the vectors is how far from English the text is.
 
 Several different distance measures (__metrics__, also called __norms__):
 
-* L<sub>2</sub> norm (Euclidean distance):  `\(|\mathbf{x} - \mathbf{y}| = \sqrt{\sum_i (\mathbf{x}_i - \mathbf{y}_i)^2} \)`
+* L<sub>2</sub> norm (Euclidean distance):  `\(\|\mathbf{x} - \mathbf{y}\| = \sqrt{\sum_i (\mathbf{x}_i - \mathbf{y}_i)^2} \)`
 
-* L<sub>1</sub> norm (Manhattan distance, taxicab distance):  `\(|\mathbf{x} - \mathbf{y}| = \sum_i |\mathbf{x}_i - \mathbf{y}_i| \)`
+* L<sub>1</sub> norm (Manhattan distance, taxicab distance):  `\(\|\mathbf{x} - \mathbf{y}\| = \sum_i |\mathbf{x}_i - \mathbf{y}_i| \)`
 
-* L<sub>3</sub> norm:  `\(|\mathbf{x} - \mathbf{y}| = \sqrt[3]{\sum_i |\mathbf{x}_i - \mathbf{y}_i|^3} \)`
+* L<sub>3</sub> norm:  `\(\|\mathbf{x} - \mathbf{y}\| = \sqrt[3]{\sum_i |\mathbf{x}_i - \mathbf{y}_i|^3} \)`
 
 The higher the power used, the more weight is given to the largest differences in components.
 
 (Extends out to:
 
-* L<sub>0</sub> norm (Hamming distance):  `\(|\mathbf{x} - \mathbf{y}| = \sum_i \left\{\begin{matrix} 1 &amp;\mbox{if}\ \mathbf{x}_i \neq \mathbf{y}_i , \\ 0 &amp;\mbox{if}\ \mathbf{x}_i = \mathbf{y}_i \end{matrix} \right| \)`
+* L<sub>0</sub> norm (Hamming distance):  `\(\|\mathbf{x} - \mathbf{y}\| = \sum_i \left\{\begin{matrix} 1 &amp;\mbox{if}\ \mathbf{x}_i \neq \mathbf{y}_i , \\ 0 &amp;\mbox{if}\ \mathbf{x}_i = \mathbf{y}_i \end{matrix} \right. \)`
 
-* L<sub>&infin;</sub> norm:  `\(|\mathbf{x} - \mathbf{y}| = \max_i{(\mathbf{x}_i - \mathbf{y}_i)} \)`
+* L<sub>&infin;</sub> norm:  `\(\|\mathbf{x} - \mathbf{y}\| = \max_i{|\mathbf{x}_i - \mathbf{y}_i|} \)`
 
 neither of which will be that useful.)
 ---
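The distance measures listed in the slide above can be sketched in a few lines of Python. This is a minimal illustration, not code from the repository: the function names (`lp_distance`, `linf_distance`) and the four-component toy vectors are assumptions made for the example. The two test vectors sit at the same L<sub>1</sub> distance from the reference, but one concentrates its difference in two components, which shows the slide's point that higher powers give more weight to the largest differences.

```python
def lp_distance(x, y, p):
    """L_p distance between equal-length vectors, for a finite power p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


def linf_distance(x, y):
    """L_infinity distance: the largest absolute difference in any component."""
    return max(abs(a - b) for a, b in zip(x, y))


# Hypothetical toy frequency vectors (each sums to 1).
reference = [0.25, 0.25, 0.25, 0.25]
spiky = [0.45, 0.05, 0.25, 0.25]    # two large differences of 0.2
spread = [0.35, 0.15, 0.35, 0.15]   # four small differences of 0.1

for name, vec in (("spiky", spiky), ("spread", spread)):
    print(name,
          "L1 =", round(lp_distance(vec, reference, 1), 4),
          "L2 =", round(lp_distance(vec, reference, 2), 4),
          "L3 =", round(lp_distance(vec, reference, 3), 4),
          "Linf =", round(linf_distance(vec, reference), 4))
```

Both vectors are equally far from the reference under L<sub>1</sub>, yet `spiky` is further away under L<sub>2</sub>, L<sub>3</sub> and L<sub>&infin;</sub>, because those measures penalise its two large component differences more heavily.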
diff --git a/slides/english_frequency_histogram.png b/slides/english_frequency_histogram.png
new file mode 100644
index 0000000..3ab182e
Binary files /dev/null and b/slides/english_frequency_histogram.png differ
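The "Machine approach" slide added in this commit (brute force, try all keys) can be sketched as follows. This is a minimal illustration and not code from the repository: it assumes the cipher is the Caesar shift that the file name `caesar-break.html` suggests, and the helper names `caesar_shift` and `all_decipherments` are hypothetical. Under that assumption a 26-letter alphabet gives only 26 keys to try, one of which (key 0) leaves the text unchanged.

```python
import string


def caesar_shift(text, key):
    """Shift each letter forward by `key` places; leave other characters unchanged."""
    result = []
    for c in text:
        if c in string.ascii_lowercase:
            result.append(chr((ord(c) - ord('a') + key) % 26 + ord('a')))
        elif c in string.ascii_uppercase:
            result.append(chr((ord(c) - ord('A') + key) % 26 + ord('A')))
        else:
            result.append(c)
    return ''.join(result)


def all_decipherments(ciphertext):
    """Return {key: candidate plaintext} for each of the 26 possible keys."""
    return {key: caesar_shift(ciphertext, -key) for key in range(26)}


# Encipher a sample message with key 3, then list every candidate plaintext.
ciphertext = caesar_shift("hello world", 3)
candidates = all_decipherments(ciphertext)
for key, plaintext in candidates.items():
    print(key, plaintext)
```

The loop only generates the candidates. Choosing among them automatically still needs a way to score each one for closeness to English, which is what the distance measures in the slides are for.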