Added Python idiom links, added fancy web font for quote
[cipher-training.git] / slides / transposition-break.html
1 <!DOCTYPE html>
2 <html>
3 <head>
4 <title>Breaking transposition ciphers</title>
5 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
6 <style type="text/css">
7 /* Slideshow styles */
8 body {
9 font-size: 20px;
10 }
11 h1, h2, h3 {
12 font-weight: 400;
13 margin-bottom: 0;
14 }
15 h1 { font-size: 3em; }
16 h2 { font-size: 2em; }
17 h3 { font-size: 1.6em; }
18 a, a > code {
19 text-decoration: none;
20 }
21 code {
22 -moz-border-radius: 5px;
23 -web-border-radius: 5px;
24 background: #e7e8e2;
25 border-radius: 5px;
26 font-size: 16px;
27 }
28 .plaintext {
29 background: #272822;
30 color: #80ff80;
31 text-shadow: 0 0 20px #333;
32 padding: 2px 5px;
33 }
34 .ciphertext {
35 background: #272822;
36 color: #ff6666;
37 text-shadow: 0 0 20px #333;
38 padding: 2px 5px;
39 }
40 .indexlink {
41 position: absolute;
42 bottom: 1em;
43 left: 1em;
44 }
45 .float-right {
46 float: right;
47 }
48 </style>
49 </head>
50 <body>
51 <textarea id="source">
52
53 # Breaking transposition ciphers
54
55 attack the fort at dawn
56
57 a t t a c
58 k t h e f
59 o r t a t
60 d a w n
61
62 akod ttra aean cft
63
64 Generally quite familiar...
65
66 ## Try all the keys, pick the one that looks most like Englilsh
67
68 ---
69
70 layout: true
71
72 .indexlink[[Index](index.html)]
73
74 ---
75
76 # ...Pick one that looks most like English
77
78 But the naïve Bayes score will always be the same!
79
80 * Same letters, just a different order.
81
82 Score by probability of substrings of letters
83
84 * Bigrams, trigrams, _n_-grams
85
86 ---
87
88 # Finding _n_-grams
89
90 Given `count_2l.txt` and `count_3l.txt`, counts of bigrams and trigrams in English
91
92 # Write a function that returns all the _n_-grams for a text, given _n_
93 * Assume the text is already sanitised
94
95 # Build `P2l`, `P3l` (after `Pl`), `Pbigrams`, `Ptrigrams` (after `Pletters`)
96
97 ---
98
99 # Breaking scytale
100
101 What are the possible keys?
102
103 ---
104
105 # Try all the keys...
106
107 *All* the keys?
108
109 What's the transposition of 'cat'?
110
111 * 'bat'?
112 * 'car'?
113 * 'wry'?
114 * 'babe'?
115 * 'powwow'?
116
117 ---
118
119 # Equivalence classes and canonical forms
120
121 Lots of words yield the same transposition
122
123 * They're all in the same equivalence class
124 * Only need to test one from the class
125
126 General idea: if there are different ways to represent something, pick one to make comparisons easier
127
128 * Canonical form, canonical representation
129
130 ---
131
132 # Finding the transpositions to try
133
134 ```
135 For each word:
136 if it's a new transposition:
137 add it to the list
138 ```
139
140 What data structure to use to store the transpositions?
141
142
143 </textarea>
144 <script src="http://gnab.github.io/remark/downloads/remark-0.6.0.min.js" type="text/javascript">
145 </script>
146
147 <script type="text/javascript"
148 src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&delayStartupUntil=configured"></script>
149
150 <script type="text/javascript">
151 var slideshow = remark.create({ ratio: "16:9" });
152
153 // Setup MathJax
154 MathJax.Hub.Config({
155 tex2jax: {
156 skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
157 }
158 });
159 MathJax.Hub.Queue(function() {
160 $(MathJax.Hub.getAllJax()).map(function(index, elem) {
161 return(elem.SourceElement());
162 }).parent().addClass('has-jax');
163 });
164 MathJax.Hub.Configured();
165 </script>
166 </body>
167 </html>
168