<!DOCTYPE html>
<html>
<head>
  <title>Breaking transposition ciphers</title>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
  <style type="text/css">
    /* Slideshow styles */
    body {
      font-size: 20px;
    }
    h1, h2, h3 {
      font-weight: 400;
      margin-bottom: 0;
    }
    h1 { font-size: 3em; }
    h2 { font-size: 2em; }
    h3 { font-size: 1.6em; }
    a, a > code {
      text-decoration: none;
    }
    code {
      -moz-border-radius: 5px;
      -webkit-border-radius: 5px;
      background: #e7e8e2;
      border-radius: 5px;
      font-size: 16px;
    }
    .plaintext {
      background: #272822;
      color: #80ff80;
      text-shadow: 0 0 20px #333;
      padding: 2px 5px;
    }
    .ciphertext {
      background: #272822;
      color: #ff6666;
      text-shadow: 0 0 20px #333;
      padding: 2px 5px;
    }
    .float-right {
      float: right;
    }
  </style>
</head>
<body>
<textarea id="source">

# Breaking transposition ciphers

attack the fort at dawn

    a t t a c
    k t h e f
    o r t a t
    d a w n

akod ttra thtw aean cft

Generally quite familiar...

## Try all the keys, pick the one that looks most like English

---

# ...Pick one that looks most like English

But the naïve Bayes score will always be the same!

* Same letters, just a different order.

Score by probability of substrings of letters

* Bigrams, trigrams, _n_-grams

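---

# Same letters, different neighbours

A small sketch of the point on the previous slide, using the example message written into a five-column grid. The variable names are illustrative, not course code. A transposition leaves the letter counts untouched, so a score built from single letters cannot tell the two texts apart, while the bigram counts do change.

```python
from collections import Counter

plaintext = 'attackthefortatdawn'
# Write the text in rows of 5 and read it off column by column.
ciphertext = ''.join(plaintext[c::5] for c in range(5))

def bigram_counts(text):
    """Count each pair of adjacent letters in text."""
    return Counter(zip(text, text[1:]))

same_letters = Counter(plaintext) == Counter(ciphertext)
same_bigrams = bigram_counts(plaintext) == bigram_counts(ciphertext)
print(same_letters, same_bigrams)
```
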
---

# Finding _n_-grams

Given `count_2l.txt` and `count_3l.txt`, counts of bigrams and trigrams in English

# Write a function that returns all the _n_-grams for a text, given _n_
* Assume the text is already sanitised

# Build `P2l`, `P3l` (after `Pl`), `Pbigrams`, `Ptrigrams` (after `Pletters`)

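---

# One possible shape for the _n_-gram code

A hedged sketch of the two tasks on the previous slide, not a reference solution. It assumes each line of the count files looks like `ngram<TAB>count`, and it gives unseen _n_-grams an arbitrary floor score. The helpers `ngrams`, `load_log_probs` and `make_scorer` are hypothetical names, and the toy counts are invented for illustration only.

```python
import math
from collections import Counter

def ngrams(text, n):
    """Return every run of n consecutive characters in text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def load_log_probs(lines):
    """Turn 'ngram<TAB>count' lines into a dict of log10 probabilities.

    The tab-separated layout is an assumption about the count files.
    """
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        ngram, count = line.split('\t')
        counts[ngram.lower()] += int(count)
    total = sum(counts.values())
    return {ngram: math.log10(count / total) for ngram, count in counts.items()}

def make_scorer(log_probs, n, floor=-10.0):
    """Build a function that sums the log probabilities of a text's n-grams.

    n-grams missing from log_probs score `floor`, an arbitrary penalty.
    """
    def score(text):
        return sum(log_probs.get(gram, floor) for gram in ngrams(text, n))
    return score

# Toy counts, invented for illustration. Real use would read the file:
#   with open('count_2l.txt') as f:
#       P2l = load_log_probs(f)
toy_lines = ['th\t50', 'he\t40', 'at\t10']
P2l = load_log_probs(toy_lines)
Pbigrams = make_scorer(P2l, 2)
```

`P3l` and `Ptrigrams` would follow the same pattern with `count_3l.txt` and `n=3`. Working in logs means the score is a sum, which avoids underflow on long texts.
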
---

# Breaking scytale

What are the possible keys?

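---

# Trying every scytale key

A minimal sketch, assuming the scytale key is the number of columns in the grid, so the candidate keys are 1 up to the length of the message. The function names and the `fitness` parameter are assumptions for illustration; `fitness` stands in for whatever scoring function is used (for example a bigram score).

```python
def scytale_encipher(text, columns):
    """Write text in rows of `columns` letters, read it off by column."""
    return ''.join(text[c::columns] for c in range(columns))

def scytale_decipher(ciphertext, columns):
    """Undo scytale_encipher for the same number of columns."""
    full_rows, extra = divmod(len(ciphertext), columns)
    # The first `extra` columns hold one more letter than the rest.
    lengths = [full_rows + 1 if c < extra else full_rows
               for c in range(columns)]
    cols, start = [], 0
    for length in lengths:
        cols.append(ciphertext[start:start + length])
        start += length
    rows = full_rows + (1 if extra else 0)
    # Read the grid row by row, skipping columns that have run out.
    return ''.join(col[r] for r in range(rows) for col in cols if r < len(col))

def scytale_break(ciphertext, fitness):
    """Try every column count and keep the highest-scoring decipherment."""
    keys = range(1, len(ciphertext) + 1)
    best = max(keys, key=lambda k: fitness(scytale_decipher(ciphertext, k)))
    return best, scytale_decipher(ciphertext, best)
```
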
---

# Try all the keys...

*All* the keys?

What's the transposition of 'cat'?

* 'bat'?
* 'car'?
* 'wry'?
* 'babe'?
* 'powwow'?

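---

# Keywords as column orders

One way to answer the question on the previous slide, as a sketch under two assumptions: repeated letters in the keyword are dropped, and the columns are taken in alphabetical order of the remaining letters. The name `transpositions_of` is hypothetical.

```python
def transpositions_of(keyword):
    """Return the column order a keyword stands for, as a tuple of indices."""
    # Drop repeated letters, keeping the first occurrence of each.
    key = ''.join(dict.fromkeys(keyword))
    # For each letter in alphabetical order, record where it sat in the key.
    return tuple(key.index(letter) for letter in sorted(key))

words = ['cat', 'bat', 'car', 'wry', 'babe', 'powwow']
orders = {word: transpositions_of(word) for word in words}
for word, order in orders.items():
    print(word, order)
```

Under these assumptions every word in the list gives the same order, so trying one of them covers them all.
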
---

# Equivalence classes and canonical forms

Lots of words yield the same transposition

* They're all in the same equivalence class
* Only need to test one from the class

General idea: if there are different ways to represent something, pick one to make comparisons easier

* Canonical form, canonical representation

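---

# Grouping words by canonical form

A sketch of the idea, assuming the canonical form of a keyword is its column order (repeated letters dropped, columns taken alphabetically). `canonical_form` and `equivalence_classes` are illustrative names. Words that share a canonical form land in the same class.

```python
from collections import defaultdict

def canonical_form(keyword):
    """Column order for a keyword: the representative of its class."""
    key = ''.join(dict.fromkeys(keyword))  # drop repeated letters
    return tuple(key.index(letter) for letter in sorted(key))

def equivalence_classes(words):
    """Map each canonical form to the list of words that produce it."""
    classes = defaultdict(list)
    for word in words:
        classes[canonical_form(word)].append(word)
    return dict(classes)

classes = equivalence_classes(['cat', 'bat', 'wry', 'act', 'tea', 'dog'])
for form, members in classes.items():
    print(form, members)
```
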
---

# Finding the transpositions to try

```
For each word:
    if it's a new transposition:
        add it to the list
```

What data structure to use to store the transpositions?


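---

# A set for the "is it new?" test

One possible answer, offered as a sketch: a `set` makes the "is it new?" check fast because tuples are hashable, and a list alongside it keeps the order in which transpositions were first seen. `transpositions_of` and `transpositions_to_try` are hypothetical names, and the rule for turning a keyword into a column order is the same assumption as before.

```python
def transpositions_of(keyword):
    """Column order for a keyword (repeated letters dropped)."""
    key = ''.join(dict.fromkeys(keyword))
    return tuple(key.index(letter) for letter in sorted(key))

def transpositions_to_try(words):
    """Return each distinct transposition once, in first-seen order."""
    seen = set()
    ordered = []
    for word in words:
        transposition = transpositions_of(word)
        if transposition not in seen:   # it's a new transposition
            seen.add(transposition)
            ordered.append(transposition)
    return ordered

print(transpositions_to_try(['cat', 'bat', 'act', 'wry', 'tea']))
```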
</textarea>
<script src="http://gnab.github.io/remark/downloads/remark-0.6.0.min.js" type="text/javascript">
</script>

<script type="text/javascript"
  src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&delayStartupUntil=configured"></script>

<script type="text/javascript">
  var slideshow = remark.create({ ratio: "16:9" });

  // Setup MathJax
  MathJax.Hub.Config({
    tex2jax: {
      skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
    }
  });
  // Mark the parent of each typeset formula; plain DOM calls, since jQuery is not loaded on this page
  MathJax.Hub.Queue(function() {
    MathJax.Hub.getAllJax().forEach(function(jax) {
      jax.SourceElement().parentNode.classList.add('has-jax');
    });
  });
  MathJax.Hub.Configured();
</script>
</body>
</html>
