diff --git a/slides/caesar-break.html b/slides/caesar-break.html

index 7a2fbf6d550cbd8e8dc90bcea3b694c0e4dcd293..090c43f9147340b37e2f6e11ef213c93a1320a39 100644
--- a/slides/caesar-break.html
+++ b/slides/caesar-break.html
@@ -207,6 +207,8 @@ Text encodings will bite you when you least expect it.
  # Five minutes on StackOverflow later...
  
  ```python
+import unicodedata
+
  def unaccent(text):
      """Remove all accents from letters. 
      It does this by converting the unicode string to decomposed compatibility
@@ -246,8 +248,6 @@ with open('count_1l.txt', 'w') as f:
  
  # Reading letter probabilities
  
-New file: `language_models.py`
-
  1. Load the file `count_1l.txt` into a dict, with letters as keys.
  
  2. Normalise the counts (components of vector sum to 1): `$$ \hat{\mathbf{x}} = \frac{\mathbf{x}}{\| \mathbf{x} \|} = \frac{\mathbf{x}}{ \mathbf{x}_1 + \mathbf{x}_2 + \mathbf{x}_3 + \dots }$$`
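
A minimal sketch of the two steps listed so far, not the code from the slides. It assumes `count_1l.txt` holds one `letter<TAB>count` pair per line; the hunk does not show the file format, so that layout is a guess. The helper names `load_counts` and `normalise` are hypothetical, and an in-memory sample stands in for the real file.

```python
import io


def load_counts(lines):
    """Parse 'letter<TAB>count' lines into a dict keyed by letter.

    The tab-separated layout is an assumption about count_1l.txt.
    """
    counts = {}
    for line in lines:
        line = line.rstrip('\n')
        if not line:
            continue  # skip blank lines
        letter, count = line.split('\t')
        counts[letter] = counts.get(letter, 0) + int(count)
    return counts


def normalise(counts):
    """Scale the counts so that the values sum to 1."""
    total = sum(counts.values())
    return {letter: count / total for letter, count in counts.items()}


# In-memory stand-in for open('count_1l.txt', 'r')
sample = io.StringIO("e\t3\nt\t1\n")
probabilities = normalise(load_counts(sample))
print(probabilities)
```

To run this against the real file, pass the open file object to `load_counts` in place of `sample`. Dividing by the plain sum matches the formula in step 2 because letter counts are never negative.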