X-Git-Url: https://git.njae.me.uk/?a=blobdiff_plain;f=slides%2Fword-segmentation.html;fp=slides%2Fword-segmentation.html;h=6215255ca3c4825937d0d1177e4a54b64bce6b23;hb=995a501e53864ff95b984e846966162d851ee9b9;hp=35721ab3fea6fc8529a893cb9d98f06ca8eb7b8d;hpb=d811998d5948d7ce4e801b9deeec0d6d6dd37553;p=cipher-training.git

diff --git a/slides/word-segmentation.html b/slides/word-segmentation.html
index 35721ab..6215255 100644
--- a/slides/word-segmentation.html
+++ b/slides/word-segmentation.html
@@ -129,7 +129,7 @@ Constructor (`__init__`) takes a data file, does all the adding up and taking lo
 ```python
 class Pdist(dict):
     def __init__(self, data=[]):
-        for key, count in data2:
+        for key, count in data:
             ...
         self.total = ...
     def __missing__(self, key):
@@ -177,9 +177,9 @@ To segment a string:
     return the split with highest score
 ```
 
-Indexing pulls out letters. `'sometext'[0]` = 's' ; `'keyword'[3]` = 'e' ; `'keyword'[-1]` = 't'
+Indexing pulls out letters. `'sometext'[0]` = 's' ; `'sometext'[3]` = 'e' ; `'sometext'[-1]` = 't'
 
-Slices pulls out substrings. `'keyword'[1:4]` = 'ome' ; `'keyword'[:3]` = 'som' ; `'keyword'[5:]` = 'ext'
+Slices pulls out substrings. `'sometext'[1:4]` = 'ome' ; `'sometext'[:3]` = 'som' ; `'sometext'[5:]` = 'ext'
 
 `range()` will sweep across the string
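
Review note: the corrected indexing and slicing examples in the second hunk can be checked directly, and the closing remark about `range()` can be illustrated with a short sketch. The helper name `splits`, and the decision to include the split that leaves an empty tail, are assumptions made for illustration only; neither appears in the diff.

```python
text = 'sometext'

# Indexing pulls out single letters (negative indices count from the end).
letters = (text[0], text[3], text[-1])

# Slicing pulls out substrings; an omitted bound means "from the start"
# or "to the end".
pieces = (text[1:4], text[:3], text[5:])


def splits(s):
    """Return every (head, tail) pair obtained by cutting s after i letters.

    Hypothetical helper: range() sweeps the cut point i from 1 to len(s),
    so the last pair is (s, '').
    """
    return [(s[:i], s[i:]) for i in range(1, len(s) + 1)]


if __name__ == '__main__':
    print(letters)
    print(pieces)
    for head, tail in splits('some'):
        print(repr(head), repr(tail))
```

Each pair returned by `splits` concatenates back to the original string, which is the property a segmenter relies on when it scores a head and then recurses on the tail.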