4 "cell_type": "markdown",
7 "# Punctuation in novels\n",
8 "Inspired by [Punctuation in novels](https://medium.com/@neuroecology/punctuation-in-novels-8f316d542ec4#.qwj8e1n8m).\n",
10 "Texts used here are [The complete works of Sherlock Holmes](sherlock.txt), [War and Peace](war-and-peace.txt), [The complete works of Shakespeare](shakespeare.txt), [Ulysses](ulysses.txt), and [Pride and Prejudice](pride-and-prejudice.txt)."
15 "execution_count": 67,
22 "import collections\n",
"import string\n",
"import pandas as pd\n",
23 "from PIL import Image, ImageDraw\n",
24 "from math import ceil\n",
25 "import matplotlib as mpl\n",
26 "import matplotlib.pyplot as plt\n",
27 "%matplotlib inline\n",
32 "cell_type": "markdown",
35 "The `string` module has some nice subsets of characters. Does it know about punctuation?"
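As a quick check of what `string.punctuation` covers, the sketch below tests a few characters for membership. The sample characters are picked here purely for illustration and are not taken from the novels. The constant only holds ASCII punctuation, so typographic quotes and dashes in a text would be skipped by any filter built on it.

```python
import string

# Characters chosen for illustration: two ASCII marks, then a curly
# opening quote (U+201C) and an en dash (U+2013).
samples = [",", "?", "\u201c", "\u2013"]

# string.punctuation is an ordinary str, so `in` is a substring test.
membership = {ch: ch in string.punctuation for ch in samples}

for ch, found in membership.items():
    print(repr(ch), found)
```

If the source texts use typographic punctuation, it may be worth extending the character set before filtering; whether that matters for the texts used here is not something this sketch checks.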
40 "execution_count": 68,
48 "'!\"#$%&\\'()*+,-./:;<=>?@[\\\\]^_`{|}~'"
51 "execution_count": 68,
53 "output_type": "execute_result"
61 "cell_type": "markdown",
64 "## Getting the punctuation\n",
65 "First, let's open a text file and keep only its punctuation characters. We can also count how often each punctuation character occurs."
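Before running this on a whole book, the same two steps can be tried on a single sentence. This is a minimal sketch: `sample` is a short sentence written for this example (an assumption standing in for the file contents), and the names `sample_punct` and `sample_counts` are hypothetical.

```python
import collections
import string

# A short sentence written for this example, standing in for a text file.
sample = '"Well, Watson," said he, "what do you make of it?"'

# Keep only the characters that appear in string.punctuation, in order.
sample_punct = [c for c in sample if c in string.punctuation]

# Counter tallies how many times each punctuation character occurs.
sample_counts = collections.Counter(sample_punct)

print(''.join(sample_punct))
print(sample_counts.most_common())
```

Joining the filtered characters gives the "punctuation only" view of the text, while the `Counter` gives the per-character tallies.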
70 "execution_count": 69,
76 "sherlock = open('sherlock-holmes.txt').read()"
81 "execution_count": 70,
88 "output_type": "stream",
90 "..-.......'.........,,,.,,,.,.-'..,-,.,,...,-,,,,,,,,.,,,,.:,,.,,,.-,-(,.-,,,,.,,,,.,,.,,..-...;,,.,,,,..\"\".\",,\"\"\".\",.,,.,.\"\",\"\",.,\"\"\",\".,.,'.,,,,,\",.\"\";\",,..,,,-.,,,-,,,\".\"\",\",.\"\"\",,.\",..,\"\"\"\"\"\",\"\"\"\"?'\"\"!...,,.--,,,\",--.\"\".\"\",.\"-,'\",\"...,\"\"\".\"\"\"..,..\",.\"\",'.\".\"\"-\".\".\",\"\"\"\"\"\"\"\"\"\".\"\".\",;,\"\".'''''''''''',''''\".\",-,.--,.',--',,,\",.\"\".\"..-''..,,.,,\"',..\",\".\"\",.\"..',,\"\",\"\",....\"\"-\"\".,..,,\",,..\"\".,.,,.-,-.,,.-,,,,,.,,,,.\"\".\"\",.\"\".\",.,.\"\",.,,,.,\",.\",\".\"\".\"\",\";.\"\"\".\"\"\"\".\",\"\"\".\",.,,\"\"\",.,..\"\",\"\".,,.\"\";\".\"\",\".\",-,\"\"\",,\"..\"\",\",.\":,,-,.\"\",\".,.--.\"\".\"!.-!,!-!-!-!,,,,\"\".-\"\"\"\"\"\"\"\"\"\".,\"\"\"\",!\"\"-\"\"\"\"\"\"\"\"\"\"\"\"\"\",!!\"\"-\"\"\"\"..\"\"\"\"\"\".\"\"\"\",\"\"....\"\"\"\"\".\"\".\"\".\",.\"\"\"\"\"\"\"\"\"\"-,...\"\"\"\"...,.,.,-\"\"\"\"\"\"\"\".\"\",\".\",.,,\"\".\"\"\"\".\"\",\"\"\"\"\"\"\"\"\".\"\".-.\"'\".\",,.'\".\"\".\"\"\"\"\",-,,.-,\",.\"-'\".',.'.,,,.,,,,.,,,,,,,..,-,--,,.',.,-,.,.\",\",,,.\"\"\"'.,\"\"'.,,\"\";.,.'..,..,,,..-,,,.,-.,.\",,.,,,,,,\"\"\".\",'..-,.,,,.,.,.,,,,..,..-,.,,.\"...,?,,?,.,.,'.,.,,\"\"\".\",.,,,-.,,.\",-,,,..,.,,'','&',..'\",,-,,.'.,,.'.,',''\",.,.,.'.',''-,.\".',...,...,,,.''.''.!'''.',,,,''\"-,,,,,,.,,.,.,,.,-\"\"\";\"\"\",.,.,,,,.''..,\"\"\"\"\",.\",.,,-\"\"\"\"'\"\"\"\"\"\"\"\",\"\"\"\"\"\"\"\"..\",\",...,,,.\"\"\"\"...,.\"\"\"\"....-.\"\"\"\",\"\"\"\"--,,,.\"\"\"\"\",-.\"'-,-..,.,.\"\",,,,,\"\"\"\"\"\".,,\"-.,,,,...,,.,,.,.,,.',.,,.,-,-,-.\"\",,\".-..,.,\"\",\"\"..'...,,\"\",\"\"...,.?,.,..\"\"\"\"!\"\"\"\"\"\"\"\"\"\"\"\"...\"-..,,,.,,,-,.,,,,.;,.,-,,.,,;,.\"\".\"\".\",,'\".\"'\"\"'\".\"''.,,.,'\"\"'.,\"\".-..,\",.,,.,,..,-.,,..,..\"\",-,,--\"\".,.,',..\",\".\".\"\"\"\"\"\"\"\",\"\"\"\"\",.\".,,.\"\"\"\",,.,,,.\"\"\"\"..?-,.,.,,,\"\"\"\"-.,.,.,.;-.-.....-.,-.,,,,.,,,.;,.-\"\"\".\".-,,.-,.\"\"\"\".,.,,.\".:\"-,\",.\"'\",.\",\".,.\"\",.\"\"\"\"\"\"\",.\"\"\"\",\"\"
\".\"\".\"!\"\"\"\"\"\"\"\"\"\"\"\"\"\".,.,'\"\".-!!\",.,..\".,\".\".\",.\"!.:\"\"\",.\"\"\"\"\"\".\"\"\"\"-,.,,.-,,,,.,\",.\".:\".--..,.,,....,,.,,.,,...,,,,-,,.\",,..,,-,.\",;-.,...,.;,.,\",\",\"\"-,\",.\"??\"\"\".\"'\"\",\";\"..\"\"\"\"..-\".\"\".\"\"\"\".\"'\".\",\"\"..-\",,,.,.'.,.,,..-,.,,-,.,.\",\".\"\"\".\"\"\"\".,.,,\",-.\"\",,.\",,.,,,\"\"\".\",,,\"\"\"\",,,.,.,.,,,.,.,.,...,,.,,\".,,,,.,,.,,,.',--,,,.-.,,,.',.\",,,,,\".,,.\",-,,.\".\",,.','\"\",..,\"\",,,\"\"',,,--\"\",,.\"\",\"\",\"\"..'.,,-,\"..\",\".\",,\"\",\",\".'',,,.,.\"\",\".\"..,\":\"-:,,,..,.--,.,',,,',\"\"\".,.\",'\".\",.,,,.,,\"\",.\"\".,.\"\",,.\",;\"',.',.,;\"\"\".\",',.'.,.;.,,,\"\",?..'\"\",,\"..\".,.,'.'\"\",\"\",.,-',.,,;,.\".,,,:',.,-'''.'',''-.',,'.,''',,'.,.,--,,-.',.'-'.''',,'''.',,,''\",,-,.''.'',,',.,,,.-,-;,.''','-'''.',.,.,,,,,,.,,.,;'\",,,,,.,.,.\",..,,,.-,''..-,,,,-,,;,,-.,;.,,.,,;\"\"\".\"\"\",.,.,.,,,.'.',''''.'.',,.,,.''.',,,',.''.'.,.''.,-.'','.,.,.?'\".\".'','!.,,-.'\",.,;.'',',.'',,'.',,.'.''''.''\"',.,,-;.,,.''.''''''''''',,,.,..'''','.''.;'.,''''\"\".,,-,.-'''.',-,.,',,.\",,;,.,'',.,,,-,,'.\",,.,..,;.'-,,.\",.,'.,.,..,,,.,,,,,,.\",,'.,.\"\"\"\",..',,.,\"-.:-.,.,.\"\",.\",\"\",\",.\"'..,,.\"\",..,.,,-,-.....'','.'',-''''','..'''',..,,.''\",.,-,..\"\"\".\"-,...,..,,,\"\"\".\",.\"\"\"..\",\"\"\",\".,,,,.\"\",.,,--.,\"\".,,,..-\"\"\"\"\"\"\"\"\"\",\"\"\"\"\"\"-,\"\"\"\",\"\",-,,,'.\".\"\".\"\"\",.\"\"\",.\"\"\",,;\"\"\"\",.'\"\",...-,\"\",\",\"\"\"\".\"\"\"\",\".,,.\"\",\".\"\".\",'\",-,.,,.\".'\".\",?\"\"-.\"\".,.,.,.\";-,.,,-,--,--.\"\",,-.,.,,.',,,.-,-,.\"\",\"\"\",\",.\",\".\",,,.\"\"\",\".'-.\"\"\"\"\"\"\"\"\"\"\"\"\"\",,.'.-.\"-..,..\"\",,\"..',,,,,'-..,,','.,-,,-\",.,,,,-,,-,-,.,,,.;,,,,-.,,..'.\",,\".\",\"\".\"\"\"\"..-.-\"\"\"\"\"\"\"\".,,,,\",,.,.,,,.,-\"\"-,.,?,?-'-.,.-,.,.,,,,,-,-.\"!\",-.\",.,?.,-'\"\"',,\".\".\"\"\"..\".,\".\",,',,.,,\"\",,.,\".\",.--\"\"\",\"-,.,.,,;,,\"\",,,,.',.,,.',.,.,,.',.'\"\"-.'.,.,,.,\".-.\"\".\",..,...,\".,,.,,.,.,,..,,-,,,,.\"\".\"\
".,.\",,\",.\"\".\".,\".,,,,.,.\"\",\".,.,--..,\"\"\".\"\"\"\"\".,.,.,.,\"\"\".\"...,\"\"\"\".,,,.'.,,.,,.,.,,.,,\",,.-.,'.,,,.\"\".\"-.,\"\"\"\".\"!,.,;,,,-,....,,,,,,.,,.,.,,.,,,,,.-,,,,,--,.,,.\"'\".\"?!,,,'\".,.,'',.\"',\".\"\"\"\".\",-\"\"\".\",!.\"\"\".\"-\"\"'\".\"'.\"\"\".\".,,''''\"\"\".\",,,,-\"\"\"..\",.\".,\".\"\".\".\",,,-\"\",\",\",''-.,,,.''.,,?,,,.,\"\"\"\",.,,.',,.,,.?',.!..-.,?.\"....,,,.,...,,...,',.,\"\"-\".\",.'-,.,,.,.-\"\"\".\",\"\"\",.\"!..\"\"\"..\",,,\".\"'''-'''\".\"\",\"..,,,,,,-,,,,\"\"\".\",,,.,,,\"\"\".\",,,,.,\".\"\".\",,,.---\"..''',.,,,,,,,.\"\",\",.\",,,.,,,,,-.,,\",..\"\",\".\"\"\",.\",,\"\"\".\",.,,.,,.,,,.,,.,,,,\"-.,,-.,,,.,,,,.\"\",.\".,..,.,,.\",,--.,,,.\"\",\"\"\"\",\"\",,,,-.\"',.\",\"\"\"\",;\"..,\"\",,.,.,.,.',,,.\"\"\",-..\",\",\".-,-.,,,,,\"\"\",\",,\"\",.,,,\"\"\"\",,.',.,',.,,.,;.,,.,'\",,,.\"\",\"\"\",,..,/.,\"\"\".\",,.\"\",.,',.,..,.,\"\"\".\",.,..\"',.\"'\".\",,....-.,;?,'.,.,,,,,.,,.\"\"\",\".\"\",,.,,,,\"\".',,.\"\",.,,-,.,,,.\"\"\"\",'.',.,,,\"\".?\"\",,.,.,\"\"\"\",,...-.---\"\"\"\"',.,'\"\",\"\"\"\"'\"\"-\"\",\"\",.,,,',,.,.,\"\"\".\"..\"\",..,...',,,,.,,,,\"\",.,,\"\"..,,.,..,,;,,.',..,;',,,\"\",\"\",;\"\"!.,,.\"\",,..',',..,-,.,-,,!,.,.,\"\"\".\",,!.,,,;,,.-,\"\".,,\"\",.,.\"\"\"\"\"\".\"\",\"\"?\"\";,,,.,,?,,,,.,??,-,'\".\"\",,\".,.,.,\"\"''\"\"\"\"\"\".\"\"'\".\"\"\".\"\".,\"\".',.'\"\"&,\"\"..,.,\"\",.,..\",.,.,,.,,,,,-,.\",\".\",,,.,,',.,,.\"\"\".\",.,.,-,-.,'?\"\",-,-,.,,.,,..'.,,--,,-\".',,..,,.,,.'..,,.,,.-,,,,,,.,,-,,\"\"\"\",,.,,;-,.,,,.,,,,-,\"\"\",,,'.\",,.,..,.,,,..\".\"\",\",...;,,,,,-;,.,,-,,,,-..-\"\"\".\"\",,\"..,.,,\"\"\".\",.''.,,,.-,\"\"\"\",\"\"\"\",.,,.,',.,'..,,,\"'.,;,,.,,.,.',.,,,,.-,,.\",\".\".\"\",,\".\",!.,,,.,,\"\",,\",,.\"',.\".\".\",-,,-,-,,,.,-,.\"-,.\".\",'\"\",.,,.,.,,,,.,,,.,,\"\"\";\".\"..\"\".\"\",\"'.,.,.,,.,'''',\"\",\",.\",.\".\"....,'''''',,,\"..\",.\".\",,\"\"\",.\",,\"\"!\".,.\",'-'\".\",..,.'!\",.\"-'\".\".,,.,,\",,.,,,,,.\"\",\".,,..,,-,,,.,,,?..,,.?.,,,,',.,\"\"\
".\"\"\".,,,,.',..,.,,'...'.,.,,,,.,,,-.,.\",.\",,.\",\",.,\"\",,\",,\".,.\",',\",',-\",,,..\"'-\",,.\",.,,\"\"\".\",.,,,.,,.,,.,,,.,,\"\"\"\",...-,,,,.,.,.&,,,.\"\"\"\".,',',\".,,.:\"?...:\"\",\",.\"\"\"'.\"\",..,.'\"\",\".\",,\".,,.,,--.\",\".\",..\".,-,..\"\".\".\"\"..,,\"\"\"\"..,.,,\"\",\"\",..,,.\",..,.,,.,-.,.,,,,.,,,.,-.-.,-...\",,,,.-,,..\"-,.,,,-....-.,.,.,,..\",-,.,.,,-,.,,.,..,.,..,-.,,...-',.,'',,.-\"\"\".\"\"\"\".\",,.,,,.,,,,,,,.,,,-\"\"\",\"\"\"\",.\",..,.,-,.-\"\"-\"\",..,;,,...,.,\"\"\"\",.,,.'\"\"\".\",\"\",\"\"\",\".,.,,,,.,-.,,,,,.-\".\"\".\".\"\"'\"\",,,.,\",..:\".,,:',,.,,.,,,,.,.,-,;..\"\".,..,.,.,,,.,.,.,.'-,,.,.,,,,.'\":?\":,.\":?\":..\":?\":.\":.\":..\":..\":.\":''?\":.\":,,,?\"(:.\":?\":.\":?\":,..,,.,.''','''','''''''''''','\"\"\"\",\".,,,,'.,,\".\"\",\"'.'?,;,,.,,,.,.,\"',,,-.,-,-,.-,,..\"\".\",\"\"\".\"\".\"\".\"?-,.,.,.-\".\",,\".\",.,,',,.,,.,!\".,,,.\",.\",,,',,\"..'.,,..,;-.\"\",\".\"\"\".?,?\"\"\"\",\",.\"!\".\"\".\".!..,\"\"\".\"....;,--,.,,,\"\"\".\"\"\",..\",.\"\".\"-\"\"'\"\"\"\",?,.,...\"\"!!\"\",\"\";-,,,.\"\",\"\",.\"\"-..,,.,\"\",\"\",,.-,\",.\",\"'.\"?-,\"\"\".\"\"\",\"\".-\"\"\"\".,,\",,,-.,,,,.',,,,,?.??,.'...,.,,.,'..?..,.?..,,,.!',''..,.\"\".\".,,.\"\"\"\"\"\"\"\".,.-,,,\"\"\",\"\"\",.,,,,,,-,?,,.,,.,,,,.,...,,,,,.\"\",\"\"!?.,,,.''..,,-\",,.',.\"\".\".,,\"\",\".\";,..',,,,\"\"!\".\",!.\"\"!,,,',,,,,?,..\"\"\",.\",,\"\"\";\"\"\",\".\"-\"\"\"\",\",.\"\"\",\",-,-,-,.,,,.,,',,',.,-,....,.,,,,.,,,,,.,.,,,,.,,.,,.,-,..,'.,.,,,,.,,.,,.\"\".\"..-\"\",,!!.,.,.-,.\",.\"'.,,...'.,?-.?,!?!!,,!,,-.\",,,..,,.,.,.\"\",.\".,.,.,\",.\",\",.\"\"\"\"\"\"\",\"\"....\"\"\"\",-,,--,,-,-.,\".\"\".\",-\"\"\".\",.,\"\"\"\",\"\"\"\"\"\",\"\"\"\"\"\".\".\"\",\"-.-\"\"\".\"..-.\",,.,.\",\"\".',.\"\"\"\",,',.,,''..,,'.,\"\"''\"\",.,,..''.'',.\"\",\".\"\".\"\".\"\"\"\".\"\".\"\"\".,..,\"\"\".\".,,.,',.\"\"\"\",,\"\"\"\"-.,\"\"\"\".\"\".,,\"\",\"\"\"\"..?-\"\"-\"\".,.,-?..,.,,,,,.,.,\"\"-\"\"..,,,-\"\"\",\",..-\"\".\",-,..,,,-,,.,,,,,..\"\".\"\"\"
,-.\"\"\"\"\",.\"\",.\".\".\"\".\".\"\"\".\".-\"\"\".\"\"\".,.,\"\"\".\"..\".\"\".\".,..\"\"'\";\"',,.;,.\"',....,.'.\"''.,-,;,,,,.,,,,.,.\",.,,.,,.-,.,,,.,,..,,,.,,..,..\",.',',;''.',,.'-',-,''\",,,.,,;,,.,.,,,,,..\",,,,,..;,,.....\",....?.,.!.,...,...;,.,,\"\",\".\"\"\",.\"\",..,.,;,,\"\",\".\",,\",.\"\".\",?',,',,'\".,;..'',.,,,,.,,,,,,,.,,,,.',.,,,\",,.,,,',,-.,.,.,,-,.,,.-,'-,.',.\"\",,\".-?,\"\"\".\"\"\",\"\",..'\",,..\"\".,--,-,.,.,,.\"\",-.\".\"\"\".\".-,\"\",\"\"\"\"\"\"\"\"\"\"\"\",..\"\",.\"\"\"\"\"\"\"\"-,\"\"\"\"\"\"\"\"\"\"\"\".\"\",,,,\"\"\".\",\".\"\",\",,,.;,.\"-.,.,.\",.',,.,.,.,,.,-,-,.,.,,.,,.\"';,,.,..,,.,.,,,-,,.',.\"-,-'.,,.'','!',,.,.,,,,'..!',,',,'','.'',,.,,..?-,,,,,,.',''.'-,,'\",.,,,,.,,.',','.,,,,,,.,!,,,.-,'..'\",.,,,.,.,.,.,,,,,,.,,,.,,,.\",,.,,.,,-,.,,,,'',,.,,,,,\"\"\",\",,.,\"\",.,\"\".\"\",,,,.,.,...,',,,'.,,.,-'..,,-.\",','.-.,.--,.',,'.\".'..,'.\".''.'.''',.'?'.'.';'''',.','.''',.''.'?'''.'.'''',.''\",.,,.\",,.,.,,.,.-,,.,.,,,,-,'',.,,,.,-.\".?,',.\",',,.,,.,;\",.\"\".\"-.':'..';''\"\"\".\"\"\"\"\"--,-\"..,,\"\"!\".\",,..\"\"\"\"\"\".,,,\".\"\".\",,,\"\"-\"\".\".\"\",\",,\"\".-\"\"..,,-\"\"\".,,,-,.\"\",\",.,,,,.,..'\",,.,\",\":\"...\".,,,..\"..\"..\"..\"\"\",.\"..\"\"\"\"...,..,,.\"\"\"\",,.;,..\"\"\",.\".\"\".,,,.\"\"\"\".,.\"\"\"\".-\"\",\"\",.\"\",,.\"..,---.,.,-.\",\",\"\"\",,\"\",.,,.\"\"\",\"\"\"\".\"?..,\",-.\"\",\",,.,,...,,,;,,,,,.,,,.,,,\"\"\",.\".,,,.,-,,,,-,,,,-.,,\".\"\",\",,-,-,.,-,.''...,..,.,.\"\",,\"\".\"\".\"\"..--..,,.\"\"\"\"\"\"\"\"-....,,.-\"\"\"\"..,..,\"\"\".\",\"\"-..'.,.,....\"\"\"\"-\",-\"\"\"\".\"\":'..-,,,,,,.,.--,.,.,,.,,,..,,,'\"\",,\"...,\"\"-\"\".,,',,--,'.,,,,,.,,,,.-,-\",..\"\";\",,'\"\"\".\".,\"\"\"\",.\",..\"\",\"\"\"\",,\".\",.\",'':\"-,,,.,,,,-,.,,,,-,.,,,.,-.,,-\",.\",\".\",,.,,,.,-\",.\"\".\"?.,,,,.,,.\"\"\"\";.,\",..',.,,.\"\".\"..\"\"\"\".\"\"\"\"\"\"\"\"..,,-.\"\"\",..\"...\"\",'',\"\"\",.\".\"\"\"\".,\"\",\",.\"\",\"','.-.,,'',,,\"\",\"\";\"\"\"\",'',',.\"\"\"\"''.,.,\"\",\"\",.,,-..,
,..--,\",,,,,..\"\",.-,\".\",\"\".,,.,.',.,,;',.,,,,.,,,,-,,.-,'-,.,-.\"\".\"'\",.,,.,,-,,.\"\",,-,,',.\",'\";\"\"\"\",,\".,!\"\"',\".-.\".,,.\"\",,!',.'..\"',,..??.,,.,,,.--,,,.,,,.?,,?,.?,,?',...-,,,...--,,.,,;-,,,,.,,,,,,-.,,,.,,,,,,,.,-,,,,.,,.\".\".\",.,\",,,,,,.\"!'\".,.\",,'\"\"\"\"\"\",\"\"!..'\".\",..\"\".',,,,-.'.'-.!\"\",\"\"..,..\",,,.,,\",\"..,,,,,,..-..,,,,,.,,,,-.\"\",\"\"\"\";\".\"\"\"\".,..,\"',,.,,;,.,',,.,..,,.\",\",\"-,\"\"\"\"\"\"\"\"\"\"\"\";,,,.,,,,.';,.-,',\"\"!\"\",,..-,..\"-,'.\",\",-,.\"','\"\"\"\",;.-\"\"\"\";..'.\"\",\"\",.-\"\"\"\".'..,;.'.-,..,\",,,,.,,,.,.,,,,,.,,,,.\",\".\".',,-.-\"\"\"\".,,.',,'.,',,\"\",\"\"-,,-,.,.,,.,,.,,:...-,,,,.,,,.,,.\"..,,.,,,,.,,,,-...,,,',,:.\"\"\"\",,..,,.,,,,-.,,.,.,,.\",---.,,,,,,.,,-,,.,,....,,,,..,..,,.'..\",,.,.-,.,..,...,.,,,-.,...,,.\".,,..',',.,,,'.\".,..,.,.,-,,,.,-,.,.,,.,,,,,,,,,,,-.,\"\"\".\"-\"\";-.,,\"\"\"\"..,,.,,,.,,,,.,,--,-,,,,......',.,,-,.\",-..',.,.\"\"\"\",'.---....\"\".\"\",,..,.?-.,,,.,,..,,'.,,\"\"\"\",.,,,.,.,-.,,,-.\",,.,,,.\"\".\",,,.?,,,'\"\"\".\"...,.,,..,,\".-',,,-.,,,,.,,-,,,,.\"\",\"\",,.\"\"\"\"\"\"\"\"\"..,\"\",..,\"\"\",.\",,,\"\"\",\",.,,\"\",.\"--,,\",\"\",\"\".,.,\"\"\"\",\".\",\",-.\",,,\"\"\"\"\"\"\"\"'.\"\"\"\"\"\",.,-\".\"\".\",-\",.\"\"\"\",..,,.\"\".\"',\"\",\"\"\"\"\"\",,,.,-.,,.,,.,,,..!\"\",.-\"\"'\"\"\"\"\"\".,\"'...--'-,,-.!-.!,,.',\"\".\"\"-.,..,,\"\",.\"\".,,.\"\",;,\"\".,,-\"\"\"\",\"\",,....,-.\"\"..,\"\".\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\",,\"\"\"\",\"\".\"\".\"\"\"\"\"\"\"\".\"\",,\"\".,,\"\".\"\"\"\"\"\"\"\"\"\",....,-\"-,,.,,,,,,,,,.-.,-,.,-,.,,,,,,-.,,.,,,.\",\".\"\"\"\"\"\"\".,-,\",,...-..\"\",.\",,..\"\"\",.\"\".\",,\",.\",,.,,\",.,--.,.,,.\"\",.\",\".,.,.'.\"\".\",\"\",,\",-,.\",\"\",..\",-,,..\",.\"\",-..,\"\".\"\".\"\"\"\"\"\",.\"\"\"\",,'.,,;,,\"\"\"\"?..\"\",'\"\".,\",,,.\"\".\"\".\"\".\"\".,,.-,,.,,,.,,..\"','\".\"\".\",\",,,-.\"!!\".\",,\"\",'\".\"',\",.,.-,,'.\"\",\"..,,\".'.!,,,!,,,,-,-,--,.,.\"\",\",,.\".\"\".\"\"\".--,,'\".\",-,\"\".
.,,,,\"\",\".\"\"\";\".\",.!!\".\"\",\".,,.,,.\"\"\".\",,,.\".,.,,.,...,,,-..,-.,,,-.,..\",,.',.',,..\",,,.,,,..,-.,,,.\",.--,,.,,,.\",,,...\",,,.,,,,,.,.,,.'.,.,.,.,.,,,,,,..,.\".,.,,,\"\"\".\"!\"\"\",\".,\"\"\",;\".\"\";\"\",\".\",\"\"\"\".,.,.,.\"\"\",\".,,\".,.-,-,,,.,-,,..\"\";\"\"\"..---\"\",,.\"\",\",,-\"\",.\",.\".,,.\"\"\",\",\"\".,,.,.,\"\"\"\"\"\"\"\",,...,,.,,,,'.:',,,,,.,,,,.,.',,,.;,,-,,,.,,\"\"\"\",.'.'','.',,,\"\",,\"\",.,,,.,,,\"\"\"\"\"\",\"\"\"\"\"\"\"\".\"\"..\".,.,.';,,\".\".-,.,,,,.\"\",.\",,.,,.\"\",\".\"\",\",.,--,.,,,,,,,.\"\"\"\",,-\",.\",,,-,,-..,,\"\",\"\".,,\"\",.,\"..\"\";\"\"\",\"\"....,,\"\",,.\".\"\"-.\".,,.,,.,,-\"\"\"\",-,,,-,.-,.,-.,,,,,,,\"\"-\"\".,,',,'\"\"\"\",-.'\"\".\"\",,;,-.,--.\"\",\",;\",,,,\",,,,.\",.!,\".\"?,?\"'.\",!\",,..\",\",\".\"\",?.\"\"'.\"\"'\".\".,.,,\"\"!\".\",\"\",,\".\",,.,,'-..,\",,,,:\".,,,,-.,-,-,.,.,,,,,,-.,;.,,',,.,,,,.,,.,,\"\"!-\",.\"-.,,.;,.,..,,.,\"\"\"\".,:',..:,'\"\".\"\",,,,.,.,,,.,,\"\",\"\",,,,.',,,,\"\",.\"\",,..,,,,\",.\"'\".\"...'...,.,.,-,,-.?'\"\"\"\"\"\",,,,\"\",,,.,,\"\"\"\"\"\".,\"\"..,.,,.\",-..,'.\".,\",.\",..,.,,.,.\"\",,\",,,,.,,'.-,,.,,-.\"\",\".\".\"\".\".\"\".,,\"\"\".\",.,,\"\",,\"..\",,,,,-\".\"\",\".,,,,\".\",,\".\",?,\"\",\",.\",-,.,,,,,.,.,,\".\".\".\".,\"\"\"\"\"\"\",.,,-.',,,.,-.-,-.\"\".\"\".\"..,\"\"!,.,,'\"\"!,\"\",\"\"?.\"\"\"\"!'.,',.-\"\".\",.\",,'.;,,,..,,\",,.,-,-.\"-.'\"..\",\",.\"-\"\"'\"\",-\"\",\"\"\"\"\"\",;\"\",.\".\",,\",,\"?',\"\".\"\",'.\"\",;'\"\"!',,.;''''''',\"\",\".\"',.',\"\",,','\".\"'\"\"\"\"'\"\"',?,\"\"'\"\",\"\"',.',\".\",\".-,.\",.\",\",'.\"\"\"\"'.'?,,,.,!?,.,.\"\".,,-\".\".\".\",'.,,,'\"\",,'\"'.-.'\"\"..\"'.,'\"\"\".,.-,.\"''',\".\",.,,,,,.-,-.,-\".-,,,,.\"'\".\".'..',?\"\";\".\",,.\"\"\"\",,.'.\",.\"!\".\",\",.,-.\",?\".\"\",\".\"\"??\"\".'\"\"\"\",..,,,.,,,.\"\",,\".\"\"-.\"--\".\",,\".\"\".\",;\".\"\".\"\",\"\"\"..,\"-,-,.,-.,,,,.\"\".\".,..-..,!\"\",\"\",,.,-,\".\",\",\"\"\"\"\"\"\",.'.-,.\".-,,,,-.,.\"',\".\",,'!,.'..!.,\",,.\",,.,.,,'\"
\"\".\"-'-.,,;.,,.,,,.,?'--.,,-,,.-\"'.\"',\".\"!!.!..'.,'!','\"\"\".\",\"\",..,.\"\"!..,?,\".\",\".\",,..,,'.,,.;,,.,;..\",,.,,.,;,,..?.,.,.\",.,.,-,,.,,.,.,..','.'','',''',''-',.'.'-,,,'',';','''','''.',''.',',.','','',.'\",,.,.,.,.,,.,',..','.'',''''',''',''',;-,'\",,,;,.-.,...-,.!\",.,'-..\"\".\",!,\"\".\".,,,.\",\",,\".;,.,.;.,-.,.,.,,,,\".,,,,;,,,.,,-.,.,,.,..',,.,,-,,,.\",\",\"'..,,\"\",-\"\";.,.-.,,,.,,,.,,\"\",\",,,.-.,,.\"-,\".\".,.,.!..,,\"\"\",.\",\"\",..\",,,,.,,.,-.\"\",.\",.,\"\",\"\",.,-,,\".\",\",.\"..-,-\"\",\".\",,.,;.-,,,,,.,.;.,..,,,,?,,,\",,-,.\"\".\",;.,.,,.,;,.\"\"\",\",,,.,.,.,.\"\",\"\",,,,\".\"\".\",,.,,,.,---,.,;,,,,,,,.,,,.,.\".,.,-,.,'-.--.,.-...,.\".,,.,',,.,-,,,,.\",.,-,,.,,,.\".,.,,\"\",\"\",.,,.,,,',,,'.,-,.;,\",.\"\".\",.-,,,.,-..',',.,.\"\"\"\"..,,.,,,,.',.',',''''.',,''.'',,,.,.-,.'',.''.,'','',,',,\"\"\".\"\"\"\"\"\"\".\"\".\"\"..,,,..,.,,.'.,,.,,,.,',.-,.-,,,.,.,.,,',!!!',',.,,-.',,,.\"\"\",\"?\"\".,,,\"\"\"\",-.,-\"\"..\"\",.',.,-,.,,,.,.,,.,\"\"\"\",\"\",\"\",\"\"\"\",\"\",-\"\",,.\".\"\";\"\"\",.,,,,.--.,,.,.,,,.,,,,,.,.,,,,-,,,\"\"\".\"\"\",\"\",.\"\",\"'.,,.\"\"..\"\",\"\",.\"\".\"..-,\"\",-.,.,,\"\".,\"\"\"\".\"\".',\"\"..\"\",..\".\",\",.\"\"\"\"\",,,,\"\",,,\"\"\"\",,',,,,,,\"\",,\"\"\"\"\"\"..,.\",.,-,-,,-.,.,,,,,-,-,,,,.\"\".\",;\".\".,\"\",\".\"\"\"...\"\"\".\"\".\"\".\"!,\",-.\",!.,\".\",\".\",--\".\"\".\",\"\".'..!!\",,.\"\",.\"\",.\",\",,.\"!,,.,,,',\"'.,.\"\".\".,',,,.,.,,,,.',.,,,;,..'..-,,\",.,.,..,,,,.,,,,.\"\".,.-.\"\".\",,.\".\"\";\"\"\"'\",;\",',-.,\"\",,\",.\",\",,.\"\",\",..-,.\".\"\",.\"..,\"\"'\",..\"\",\",\"\"\"\".\"\",.-.,'.,,\",-,,,.,,.,-,,,.,-,.-.\",,,',.'\"\".\"\",.,\"\".\"\"!.,.,\"\",.\"\",.,\",,,,..,,.\"\",,\"..,\".,,,.,,-.,-,--.,-,.,-,.,,.\"\"-,.\"'\"\"\"\",\"\",\"\",.\"\",-.\",.-..-.\",'\".\"'\"\",..\"\"!\"\"\",.\".,,,,\"\"\".\"-\".\",\"\"--,.,,\".'-,.-,,,,,,..\"'\",.\"'\"\"!,\"\",.\"\"',\"\".\"\",\".\";'.\"\",,!,,,.\".\".\",.\"!\".,,.\",\"\"'.'\"\",?,!',.,,\"'.,.\",\",\"\"\"\"\".\"\"\"\",\".\"
,..\"\",\"\".\"\"\"\",,.,,,,.,,\"\",,\"\"\"\"\"\",\"\",.,\",'.\"\"\",','\"\"\"\",\"\",..,,..-,,,\"-.,,..,.,'.,-.\",\",\"-.\"\"\"\"\"\"\"\"\"\".\"\",.\"\"-,\"\",\"\",.\"\"\"\"\"\",,..'.,.,'.\"\"\"\",.,,.\"\"\"\"\"\"\"\".\"\"\"\".-,-\"\"\",\".\"\"....,,,.;'\"',.,,,,.\"\",;\"\",,.,,.,.,,,,.\"\";\"\"...\"\".\"\".,;.,',.,,..,:\"\".\".\".\";..,\".,..,.?,,-,,.,.-,,.,.,!,,.,,.-.,,..-,,.,,,-.\",\".\"\".,,.,,..,.,,.,,,.\"\".\"\".\",,.,.'\"..,,..-,,.,,.-,,..,.,,..\"!\"..,-.\"\";\"..,,,.,\"-',',',,..,.,,..\"\",\",,.,'',,,.,,.,,,-.,,.,,..,,.-,,...,,.,.,,.\".,.,,..,.,\"\"\"\".,..',\".',.,,-.','.,,.,,,,,-,.,.',,.,.,,.,,.,',-.,,.,,,.\"'\",;\"'\"\",\",.\"'\".\"';'.,.,;,\",,.-..,.,--,,,;,.\",\",\".,,.,-\".\".,,,(.\",,.\"\",-.\",,\"\",\",.,,,..\"\";\"\".,..,-.\"\".\".\",.\"'\".\",,,\"..,..\"\",\".\"\",.,.,\"\"!\"\",,\"\"\",,\"\"\"\".\",\"\"\"\"!\"\"\"\"\",,,.,.\"\".\"!,.,\"\".\"\",,.;,,,,,;,,\"\"\",\",,.,\"\",\",\",.\"\"'.'\"\"\"\"'..\"\";\"\",\",,,.,,--,-,,.,,.,,.\",.\".\",.,\"\"\",\",.,\",-,,.\"\",\",.,&,-,.,,',.\"..,..,,,,.\",,,.,,''.,,..,.,,,,.,,,.'.',.',.,'\",.''',.''';'.''..,-,,''',''\",.','.','',,?,''''',..''',.''.'.\",..',';'',.'''.'''','...''''.-'''',.,.:'''''','',.''..'',-''.''.,,.,,,'\",.'','.,,''...''''.'-,''''--.'-.,,,-,,.-.,,.,,.,.,,,.,,,,,,-.-.'''.''-,,,''','.,..,.,'.',,:''''',,,,,.\",,,.,,,,.,,'-,.,,,,,.\".,,-'.,.,,.,.,-,\"\"\".\",\"\"\"\",-.\"\"-\"\",\"\"..\"\",.,,,,.,,,.,.,,.,,.,,-,.,,,.,,,.,.\",.,,.,,,.,.,.,,,,.'',.,,,,..'',.\",,.,-,,,..,..,,---??,,,,,.,,,,,.,,.,-.\",,.,,.,.,,,,.'',,,;'..'',','.'''.';',,,.'',''\",.-,,.?,?,,.,,,.,.,,.\",..''.',.''',''\".','.'.'','',,'','-'',...'\",,.,,,,,.,,,.,,,.,-.\",.,,.,.'',',.,.,.,,.'\",.,.,,,,.--.,,.,.'-,.,,..''.\".''-';''\".,.'','',,.,,.''.'!!'\"....,,,,,.,,,.,.,,..,.,;,?,.\",.,,..,-.,,,.\",,,..'!'.'..,-,'\",,..,,.'.,.''.','\",'.,,.,.,.-,.,;.'!','..!,'',',.'..,',,,.,,.,,.\";,.,,,.,,,,.,,-.\".,,.,-.',.,,.,,,,.,.\",..,,...?,.-?.\",..,,.\"..\"\".\".:',,.,-,.',.',.!,\"\"\".\"\"\".,,--.,,\",.,,,,-,..\"\".\"..,,\"\"'\"\"\"\".,,\"\"\
",\".'\"\".\"\",\".\",,\"\"\".\",\",\"!,,.,\"\"\".\"\"-.\"\"\"\",\",\"\"\",;\"'..\"\"\"\"'\"\",,.\".\"\"\"-\".\"...\"\",\".\"\"\"\".\",\"\"\".\"-.,,.,,\",..\"\".\",\"-.\"\"\",,,\"\"\"\".'\"\"\",\".,,,\"-.\",,.,'-.,,,,\"-.,,,-.\"'\",.\"-,-.\"\",\",\".-,,,.,\"',,,.,,'.,-.,,,,,,,.-,,.,.,.,,,.\"\",\"!-,\"\"\",.\",;\"..,,.,--.,,,,.,,.,,,.-,,,,'.\"\".\",,--\"\",\",,\".\".\",,,\"\",\"\",\"\"\"\"\"\",\"\",,,.,,.,\"\"\",.\"\"\",....\"\",,\"\"...,.:'.--.,,..,,,-,.',,,,.,.'\",,\".\"'..\"\",,.,\"-.\"\",.\"'.,'!':,.''-,.-.,,.,.!,.,\"\"\",\",.,,\"\",.-,,.\"\".,,,:'',',,,.,,,.,,,..'\"\"\",,.\".,:',-....,',,.,,,,.,.,'\"\"\",.\",;.,.',,,..-,-,',.\"\"\".\"\"\",\"\"\"\".;,\"\";\"\",;.\"\"\"\"\"\",,.,':'..,,;.,.',.',,,,.,,,.(,..,,.,,,...,,,.,,,,.,,.,.,,,,.,,.,,,,'\"\"\"\",\"\"-\"\",,.,.,-\"\"..,,,.,,,\"\".\"-,.,,,-,,,-.,,.,,-,.,,,-,,,-,-.,,.\"-,.\",.\"-.,..,\"\",,...,,\"\",\"\"\"\"\"\",!.\"\"\"\"!\"\"\",\"\"\"!!!'.,\"\".,.-,,\"..\",,\"\".\"\"\"\"\"\",\"\"\"\"\"\"\"\"\"\"\"\",\"\"\"\"\"\"\"\"..,,\"\",'-'\".\",.\",\".,.,,,.-,..,---\".-\"\"\"\"\".,,,....\",,\"\",.,,\"\",,\"\".\"\",,,\"\"\"\".\"\"\"\"\"\".\"\"!.\"\"-\"\"\"\",,.,\"\",\"\",..,.',,.,;,,\"\"!.,\"\",.\"\"'\"\",;,-..\"\".,,.-'\"\"\"\"\"\".\"\"\"\"..,,\"\"\"\",.\"\"\"\".''.\"\".\"\"-\"\"\"\",..,,,,.\"\",,,,',,\"\".,,.'\"\",.,\"..\"-..,,,..,-.,,,..',,,,,,.\"\"\"\",,\"\"\"\"..,,.\"\",\"\",\"\".\"\"\"\",.\"\",,,..,,,,\"\",\"\",,-,-\"\",\",.\",.,.-\"\"\"\"..\"\"\",.\"\"\"?\"\"\"\",,\"\"\"..\"\",,-.\".\",.\"-.\"\"\"\",,,..,,'\"\"\"\",,-.,-.-,,!-,!,\"-,,..\"',\".\"\"\"...\"\"!\"\"?.\"\"\"-.\",\"\"',\"\".\".\"\".\"?\"\"\".\"\".\",,\"\",!\"\"\"\"\",-,',.\"\",-.\",\"\",\",.\"\"\".-.,\"\",'.\"\"\"\"\"\",,\".\",,..\"\"\"\".-.-.\".\":'....'.,,,,.,,\"\",\",.\".\",,.\"\".\"!\"\".\".\"\",\"'\"\",\"\"?'!\"\",\"\"'.\".'.,,.,,.,,'\"\".,.,,,\"\"'\",.\".-,.,\",,.\",\";\"...,,\".,,,..\"\",\",,\"',,'.,,,-.,,.,,,.'.,.\",\",.\".\"\",\".\"..!\",,.\",\".\",.\"\"\"..\"\",\"\"\".\"\",\"\".,.,\"\",,\".,.\",\"\".,\"\"\".\",.,.,\".\".\"\"...,,\",-,.,.,,.\"',\"
.\",\"\"\"..\",,;,'.'\"\",.,\"\"\",\"'.,\",,,-,.\"'\".\"',',,.,;,.;','.',;,.,.,,..'',',';'',,,,;,.\",,.','.,.'.,..',,,,.\",.,'.',..,,.;,,.'.,.'.?,,.,.,..,.\",,.,..,..'..,,..--.,,.,,',,,\"\"\".\"\"\",,-,,.-.-,,.-,,.,,,,..,.,,,,\".,.\"\",\"\"\"'?'\"\",,\".\"\",\"\"\"\".\",.-\".\"\".\",.,--\"\"\",\".,,,.\"\",\"\",,,.,,.?,.,?,.,,.,?;.,,..'.,',,,--'-.,-\"\"\"\",.,,,\"\"\"\"...,.,,,,.;,,..,,,\"\"\".\"\"\",\",,\",,,...,\".\"\"-,\".\"-,.,,,.,--.,,.,.,,,,,.,-,,,--.,,,.,,.\"\".\"\"\"\",.\"\"\";..!\",,,.,,,.,.,,.-,,,.\",\".\".,\",.,,.\"\".\"\".\"--,.,.;,,.,.\"\",\",\"\"\"\",\".,&,\".,,?,,.\"\";\"-.,.,.,.\",,.,.,,,.\".,-,-,,.,,,.'.',''''.'',',.,,,.''',,'.',,.''','.,,,,,''',,.''''.'',,,-,.'-','..'\".''.'.-''....'''',.,,.,,.,,.,,,'\",,,.,,,.,,..,,.\".',?,!,,,.,,.-.\",.,.,.-.,,-,.,,..,.\"..,.,.-.......,.\",.,,,,.,,,.,,.,,,.\",,,,.,-,,,,.,,.,,,,'.\".;,.-,,,,....,,.,;,!-!\",.,,.\"-,,,.,,,,;.,.''.''',''.''.',.-'\",.,,.',',''','.'''',','','.','.'.,,'\",.'',.\",,.-.,.',',,,,',,-''''.,'',.'','',-',.\",.,,\"\",\"\".,,,.,,.,..,,.,,-.'','!!'\",,,,,.,...,,.'',.'!!'''.',',.'.'.'..?''','.,.'''-.''''.','\",.,,',,,.-.,,,.,,..'','.'','.,,;..'';',.,,''',....;,.,,,....,!,,.,\",.,.\"\".\"'..,\"\"\"\"..\"\"\"\".,.--\"\",,\"\"!\"\"'\"\"\"\".\"\",\"\",,\"\"!..?,\"\".,?..\"\"'\"\"!.,,\"\"\"\"\"\",.\"\",\",\"?;..,,,-,,,,,-,,-.,\"\"\".\",\"\"\";\",,.,,\",,.','.,,.,.,.-,.-,-,.,,'.,,,,.,,',..-..,,,.'.,,,.,,-.,.\",,\".\",,,\"\".'.\"\",,\"\"?\"\",\"\",.,,..\"\"-,!.,\"\"\",.\",..\"\"\".\"?!,,.,,,,\"\",,,\",.\".\"\",,\"\"\"\",.,\"\".\"\"\"\"\"\"\"\"?\"\",-,'\"\".,\"\"\",\"\"\",...,.,\"\";.,,\"\"\"\",!-.\"\"\",\"-,\"\",\"\"\"'.\",\".\"\",',.\"\".\".\",..\"\".'-,,,..\"\".\"--\"\"\"\"-\".\"\".\".,,.\",.',-.,.\",.\",\".\".\"\".\"\",.\"\";\",,..,,.?.\"\".\"\".,\"\"'\"\"\"\"\"\".,.,.,\",,.,.\",.\";\"\"\",..\"\"\".\"\".\"?\"\"\"\",',\"\"-.,,\"\"\"\"..-;\"',.,,...,,,,,.\"\",.\",,'.,---,.\",,.,,-..\"\".\"\"\"\"\",..'\"\"\"\",..,.,.,,\".,.,,.,,.,.,,.\",\",\"\"\",\".\".\",,.,,,.,.\"\".\",...,,\"\"\"\".,,.,,..:'--,.,,,.,;
,,,-.,--'\",.?\"\",,..,.,\"\"!!,.;!\"\"\"\"\"\"..,.-?.\".,,..\"\".\"!\",.\",.\".\"\".\",\"\",.,,,\"\"\"\",-,\"\"!\"\".,,,.,,\"\"',,,\"\",.,,:.\"\"?\"\";..-,,..,,.,\"\",,\".\",,.,,,,.,..,.,,.,'-,.\",,.,,,-.,.,-.,,,,.,,,.\".,-.,,,,,,.,,,,.,.,,,,,,\"\"\".\"..,,\"\"\"..\",!!!.\"\"\",\".,.',.,,,,.,,,-,.,,.,;.\",..,,,.,.,.,,,,,.,.,,.\",,,,,...;;;;;,.,..,?\",,,.,,.,?.,,-.,,.\"?,?,....,,.\",.',,,,,,-.\"\"-\"..\"..,.,,..,,.,-.,,..-..',',''',.,.,,','\"\"\",.\",,...,.\"\"\",\".,,,\".\"\",,\".,,,,,,,\"\"\",,\"\"\",\",--\"\"\"\",'.\",\",,,.\",-....\",.-,,.,.,,,.\"\",,\",,,,.,,,,.,\"\"\",\"\"\",,,,,!,,,,.,,.,-.,.-,.\".,:\".--.--.,\"\"\"\".\"\"\"-\"\",\"\".,,.,\"\",.,,,\".,,,',.\",\",,\",,\"\",.\".,,-,.\"\",\",,,,.,,.,'.\"-',.,.,,,.\",,.,..'';'.!'.-.','.','''','''''',,!-',.''',,'.',,,-'','.'.,?.,.,,?,,'\",.,,,.,,,-.'',,','\".,,.','.'..,.,,-'',?''-.,!!!!'.\"','.',',''',,,,'.',,,.,''''.,.,--.,.''',.',,'','''\".,.,,...''.,.''.'',',,,'.''',,'.',;.,.,,'\",.''.','',,,'.'.-,',.\",.,,,.,,..,?.,..:\"',.'--,.,.,,.,..,,,(,,,.,,,.,,,,.,,.,-..,'\",.,.,,\"\",,,\",.\"\"\"\"\",.\"\",..\"\",..,-.,,\"\"-,,.\"\",.,\"\",,-..,?\"\".\"\",...-\"\"!\".\"\".\",,\"\"\".\"..,-,-\"-.\"\",,\"\"\"\".\"\"'.,-.,,,,,,.,,,.\"!!\".\"'\".-,-.,,,.\"\",..\"-\".\"!'.\"\"\",.\"\"\",\"\"-\",.\":\"\".,\"'.,.,,.,,'.,,-.\"\"..\",\",\".,.,\"\"\".\"\"\".,,,\"\"\"\"..,',,,.,,.,,,,,.,..,\"\".\"\".\"\",?\"\",..,,\",,.-,.\"\".\";.\"\"\"\",,..,\"\"\".\",,,-....,\"\"\"\"..,.-.,,,,,,.,,,.,'..\",,.,.,...,-,,,,-.,,...,'.\".....,.,,.,.,..,-.,..,,,.,.,,,\"\"\",\"\"\".,,.,.,,,,,.,..,..,,.\";,..',',,',,..-.,'\".,,....,.-,,,.,.,.,,.,..,,,,,,.,.,.\".,,,.-,,,.,,,,.\",.,.,,,.,.-,,.,,,....,,,,.,.,,...,..'',''','.',''!!'''',,.'\",..,,,\"\"\".\"\"\",,.,..,.''.,.''\",.''',.'',.,,,.,,.,.'',''\",'.,.,,..,,,,,...\".,,,.,,.,,.,.,.,.,..\".,..?,,.,.?,.,.\",,.,.,,.,.,,,.,,,.,,..\",,.,,..,,.,.'',',.'\".'',','\",,.''.'.,!.?',.,.\",.,,.,.-.';'.,,.\".,.,,.,;....,,.,,.\",,.,.,,,.,,,.,.,..,.,,..,-.,,.,.'',,',.'','.'!'--','\"...''.'.,''',.','.'''''.'.'-'',,.--
,-'''\"...,...,,,,...,...,,,,.,,,..,',..,,.,.,,,,\".,,.\"\".\"..\"\".-\"\"\"\"\"\",-\"\",.?\"\".\"\"',.,,,..,.,,\"\"\"\"!..,..,,,,.,,,,.,,,,..--,',,,,,...\"\"\".\",.'..',',,,,\"\",.\".\".,\"\",.'.,\",,-.,,-.\"\"..\".\".\".,.'\"\"\".\",\",,,..,.,'.\"\".\",,.,,,\"...,,.,.\"\";\"'\"\"\"\".\".\",\",\"'.\"\"\";\"\"\"...,,\",,.,.\"\",\"'\",.\"\",\"!!,?.'\".\"'\".\"\".\"\",.,,..\"\".\".'.,,'\",.,',.,,.,,.-,,.,,.\".\".\",...,,',\"\"\",.\".\"\",,,\"\",,,\"\"\";\"''.'-,','.\",',.,.'.,,,,.'.;,,.,,.',-,'.,,;',\"\"\",\",..,,\"\",\"\".\"\",\"\".,,,,,\"\".-,-\"..\",\"\",,\"\",.\",\"..,,,,\"..,,.,'..,,,.,,,,,.\n"
95 "sherlock_punct = [c for c in sherlock if c in string.punctuation]\n",
96 "print(''.join(sherlock_punct))"
101 "execution_count": 71,
109 "Counter({'!': 171,\n",
123 "execution_count": 71,
125 "output_type": "execute_result"
129 "sherlock_counts = collections.Counter(sherlock_punct)\n",
135 "execution_count": 72,
158 "execution_count": 72,
160 "output_type": "execute_result"
164 "sherlock_ps = pd.Series(sherlock_counts)\n",
165 "sherlock_ps.sort_index(inplace=True)\n",
171 "execution_count": 73,
179 "<matplotlib.axes._subplots.AxesSubplot at 0x7ff442b36d30>"
182 "execution_count": 73,
184 "output_type": "execute_result"
188 "image/png": "(bar chart of punctuation counts for the Sherlock Holmes text; base64 image data omitted)",
190 "<matplotlib.figure.Figure at 0x7ff4481c1b38>"
194 "output_type": "display_data"
198 "sherlock_ps.plot(kind=\"bar\")"
202 "cell_type": "markdown",
205 "Now that we can read and process one novel, let's wrap those steps in a function and apply it to some other novels."
210 "execution_count": 74,
216 "def punct_summarise(fname):\n",
217 "    content = open(fname).read()\n",
218 "    punct = ''.join(c for c in content if c in string.punctuation)\n",
219 "    counts = collections.Counter(punct)\n",
220 "    return {'punctuation': punct, 'counts': counts}"
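To see what the function returns without needing any of the text files, here is a minimal sketch that applies the same two steps to an in-memory string. The name `punct_summarise_text` and the sample sentence are hypothetical, introduced only for illustration; they are not part of the notebook.

```python
import collections
import string


def punct_summarise_text(content):
    """Hypothetical in-memory variant of punct_summarise: takes text, not a filename."""
    # Keep only the characters that string.punctuation lists, in reading order.
    punct = ''.join(c for c in content if c in string.punctuation)
    # Tally how often each punctuation character occurs.
    counts = collections.Counter(punct)
    return {'punctuation': punct, 'counts': counts}


# Made-up sample sentence, not taken from any of the novels.
sample = '"Well, Watson," said Holmes, "what do you make of it?"'
summary = punct_summarise_text(sample)

print(summary['punctuation'])              # the punctuation-only "skeleton"
print(summary['counts'].most_common(2))    # the two most frequent marks
```

Note that `string.punctuation` only covers ASCII marks, so typographic quotes or dashes in a text would be skipped by this filter.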
225 "execution_count": 75,
233 "Counter({'!': 171,\n",
247 "execution_count": 75,
249 "output_type": "execute_result"
253 "# Complete Sherlock Holmes\n",
254 "sherlock = punct_summarise('sherlock-holmes.txt')\n",
260 "execution_count": 76,
268 "Counter({'!': 3923,\n",
290 "execution_count": 76,
292 "output_type": "execute_result"
296 "wap = punct_summarise('war-and-peace.txt')\n",
302 "execution_count": 77,
310 "Counter({'!': 10815,\n",
324 "execution_count": 77,
326 "output_type": "execute_result"
330 "# Complete works of Shakespeare\n",
331 "shakespeare = punct_summarise('shakespeare.txt')\n",
332 "shakespeare['counts']"
337 "execution_count": 78,
345 "Counter({'!': 1576,\n",
364 "execution_count": 78,
366 "output_type": "execute_result"
370 "ulysses = punct_summarise('ulysses.txt')\n",
376 "execution_count": 79,
384 "Counter({'!': 500,\n",
406 "execution_count": 79,
408 "output_type": "execute_result"
412 "pap = punct_summarise('pride-and-prejudice.txt')\n",
417 "cell_type": "markdown",
420 "Place all the counts into a pandas DataFrame, normalise each text's counts so they sum to 1, and then plot them."
425 "execution_count": 80,
434 "<table border=\"1\" class=\"dataframe\">\n",
436 " <tr style=\"text-align: right;\">\n",
439 " <th>shakespeare</th>\n",
440 " <th>sherlock</th>\n",
441 " <th>ulysses</th>\n",
635 " pap shakespeare sherlock ulysses wap\n",
636 "! 500 10815 171 1576 3923\n",
637 "\" 3553 6 4834 8 17970\n",
642 "' 748 27942 1490 4485 7529\n",
643 "( 38 0 5 1777 670\n",
644 ") 38 0 0 1788 670\n",
647 ", 9280 82750 7053 16349 39891\n",
648 "- 1193 4590 965 5037 6308\n",
649 ". 6396 36881 4843 21361 30805\n",
651 ": 155 10649 56 2564 1014\n",
652 "; 1538 17400 202 34 1145\n",
654 "? 462 10327 138 2235 3137\n",
661 "execution_count": 80,
663 "output_type": "execute_result"
667 "punctuation = pd.DataFrame({'sherlock': sherlock['counts'],\n",
668 "                            'wap': wap['counts'],\n",
669 "                            'shakespeare': shakespeare['counts'],\n",
670 "                            'ulysses': ulysses['counts'],\n",
671 "                            'pap': pap['counts']})\n",
672 "punctuation.fillna(value=0, inplace=True)\n",
678 "execution_count": 109,
686 "<matplotlib.legend.Legend at 0x7ff44264e7f0>"
689 "execution_count": 109,
691 "output_type": "execute_result"
695 "image/png": "<base64 PNG data omitted: bar chart of normalised punctuation frequencies for sherlock, wap, shakespeare, ulysses and pap>",
697 "<matplotlib.figure.Figure at 0x7ff441c59668>"
701 "output_type": "display_data"
705 "punctuation_normalised = punctuation.div(punctuation.sum())\n",
706 "ax = punctuation_normalised.plot(kind='bar', fontsize=14)\n",
707 "ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))"
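The normalisation above divides every count in a column by that column's total, so each text's proportions sum to 1 and texts of very different lengths become comparable. A library-free sketch of the same arithmetic on a single `Counter` follows; the helper `normalise` and the toy counts are made up for illustration.

```python
import collections


def normalise(counts):
    """Hypothetical helper: turn raw counts into proportions of the total."""
    total = sum(counts.values())
    return {mark: n / total for mark, n in counts.items()}


# Toy counts, not taken from any of the novels.
toy = collections.Counter({',': 6, '.': 3, '!': 1})
shares = normalise(toy)

for mark, share in shares.items():
    print(mark, round(share, 2))
```

Because each text is scaled by its own total, a long text such as War and Peace no longer dominates the chart simply by having more characters.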
711 "cell_type": "markdown",
714 "There are too many types of punctuation with very low counts for the chart to be readable. Let's look at just the most common punctuation."
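One possible way to pick "the most common punctuation" is to pool the counts from all texts and keep the top few marks. The sketch below does this with plain `Counter` objects; the cut-off `K`, the toy counts, and the pooling rule are all assumptions, and the notebook's own selection may well differ.

```python
import collections

# Toy per-text counts, invented for illustration only.
toy_counts = {
    'text_a': collections.Counter({',': 50, '.': 30, ';': 2, '&': 1}),
    'text_b': collections.Counter({',': 40, '.': 35, '"': 20, ';': 1}),
}

K = 3  # assumed cut-off: how many marks to keep

# Pool the counts across every text.
combined = collections.Counter()
for counts in toy_counts.values():
    combined.update(counts)

# Keep the K marks with the highest pooled counts.
common_marks = [mark for mark, _ in combined.most_common(K)]

# Restrict each text to those marks; a Counter returns 0 for a missing mark.
filtered = {
    name: {mark: counts[mark] for mark in common_marks}
    for name, counts in toy_counts.items()
}

print(common_marks)
print(filtered)
```

Pooling raw counts favours the longest text; ranking by normalised proportions instead would weight every text equally.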
719 "execution_count": 108,
727 "<matplotlib.legend.Legend at 0x7ff441e45630>"
730 "execution_count": 108,
732 "output_type": "execute_result"
736 "image/png": "iVBORw0KGgoAAAANSUhEUgAAAegAAAD/CAYAAAA69EWbAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X10VNX97/H3TsKTShIeQgwkMQGiCNhCa1HRNAkVGvrj\nqQqWIAEq1eu1omhrlVsxscsfPiCV4mVVuS2gggVpCwREjL+SUSz1AVrQUCErUAMkSEVDghGSQM79\nI8k0hDxMZuZkziSf11pnrcmcc/Z8Z2dmvmfvs8/ZxrIsRERExFlCAh2AiIiIXEwJWkRExIGUoEVE\nRBxICVpERMSBlKBFREQcSAlaRETEgcLsKtgYo+u3RES8YFmWCXQMEni2tqAty3LUkpWVFfAYFFPH\niksxKSZ/LyL11MUtIiLiQErQIiIiDtSpEnRqamqgQ7iIYvKcE+NSTJ5RTCJtZ+w652GMsXQ+RUSk\nbYwxWBokJnSyFrSIiEiwsO0yK6g9Euyo1DsgIiJ2sjVBk5dna/EBk5YW6AhERKSDUxe3iIiIA9k6\nSMyWgh1CXdwiYgcNEpN6tnZxZ2VluR+npqbqsgYRkUZcLhculyvQYYgD6TIrEREHUQta6ukctIiI\niAMF9WVWaqGLiEhHZe9lVtiZQNUDJCIiHZfNCVpJVERExBttStDGmHjgnzTfNL7asqxj9X+oC1pE\nRMQ7bRrFbYwJBa5oYZMiy7LO122rUdwiIm2kUdxST5dZiYg4iBK01NNlViIiIg6kBC0iIuJAStAi\nIiIOpAQtIiLiQErQIiIiDqQELSIi4kBK0CIiIg6kBC0iIuJAbU7Qxph7jTF/N8Z8ZYw5Yox5pIVt\n/bqIiIh0Ft5MljEGWAjsB1KA3xlj9luWtaXxhnnk+Rjef6SR5reyREREnM7nW30aY4qBxZZlLW30\nfLve51O3FRWRjkC3+pR6Pk03aYz5P3VlrGtygzz/taBblKbWtYiIdCxet6CNMY8C84CxlmV91MR6\ntaBFRNpILWip51UL2hjTH/gVML6p5Fwvq8HjVCANJVIRkYZcLhculyvQYYgDedWCNsZ8G/gASLQs\n60gz2zg2E+sgQUScSi1oqeftOeh/AqOA4y1ule1l6XbKDnQAIiIirfO2BT0KeBkYY1lWSTPbOLaZ\nqha0iDiVWtBSz9sW9CVAUuv7e5MIjRKoiIh0ej5fB91swcZYStAiIm2jFrTU8+k66NbpMyYiIuIN\nWxO0WsIiIiLe0WxWIiIiDqQELSIi4kBK0CIiIg6kBC0iIuJAStAiIiIOpAQtIiLiQErQIiIiDqQE\nLSIi4kBK0CIiIg5k653EjGn+Vp+6y5iIiEjzbE3QeeQ1+XwaaXa+rIiISNCzeTar5qkFLSJyMc1m\nJfV8PgdtjJljjKkxxsQ3XmdZVrOLiIiINM8fXdyngANAdeMVOgctIiLiHZ8TtGVZm4BNTa3LavA4\ntW4BzRItIlLP5XLhcrkCHYY4kK3noJsr2aAWtIhIU3QOWurZe5lVS+ta6P7uKHQQIiIi3rI1QZNt\na+nOlh3oAEREJJj53MVtjJkDrAQSLMs60uD5Tt98VAtaRNpKXdxSzx8t6ERgP3Ds4lXBmKCMEquI\niAScPxL0eOCnlmXVXLxKB4EiIiLesHcUt1qiIiJt0lwXt04bdlzNndKwd5CYiIj4jRo9HU9LVzRp\nukkREREHUoIWERFxICVoERERB1KCFhERcSANEhMRCULtcbtkDUoLLCVoEZGgZWcC1X0sAk1d3CIi\n4rWEhASeeuophg0bRu/evbnjjjuorKyktLSUCRMm0K9fP3r37s3EiRMpLi5275eamsqCBQu47rrr\niIiIYMqUKZSWlgbwnTiPErSIiPjk1VdfJTc3l0OHDlFQUMATTzyBZVnMnTuXI0eOcOTIEXr06MG9\n9957wX6vvPIKq1at4vjx44SFhXHfffcF
6B04k613Emtunc5riIg0raU7iTX87aw9B21vF7cnv9WJ\niYksWLCAu+66C4A33niDefPmUVhYeMF2e/fuZcyYMXz55ZcApKWlccMNN7Bo0SIAPvnkE0aMGMHZ\ns2c7xXTE9VqaHMXWc9B55F30XBppdr6kiIi0s7i4OPfj+Ph4SkpKOHPmDPPnz+fNN990d11/9dVX\nWJblTsCN96uurubkyZNERUW17xtwKFsTdHPJWKMPRUQ6jiNHjlzwuH///ixZsoSCggI++OAD+vXr\nx969e/nWt751QYJuvF+XLl3o27dvu8fvVPbOB513cQu6XaSlKUGLSFAKti7uhIQEIiIi2LZtGz16\n9GDSpEmkpqZSXV3Nxx9/zMaNG6moqGDu3Lls3ryZc+fOERISQmpqKocOHSI3N5crrriC2bNn061b\nN9asWWPje3Kelrq4/TFI7BRwAKj2Q1kiIhJEjDHMmDGDcePGMWjQIJKSknj00UeZP38+Z86coW/f\nvowePZrx48df0HtqjCEzM5M5c+YQExNDVVUVy5YtC+A7cZ6ADBJrD2pBi0gwalsL2l6eDhL7/e9/\nz5gxY9pUdlpaGpmZmdxxxx3ehtchBGyQWBbwOJCVlUVqaiqpqal2vpyISNBxuVy4XK4279cRGiEd\n4T3YqcO2oFujD4aIOJGnLWinUAvaNy21oP0xSOyHwJPAGMuySho8b5HtU9H2yVaCFhFnCrYELb6x\nu4s7AkhqsqxsP5QuIiLSCdncxd2wbM+G7IuIdGZqQXcuARskptlQREREvGNrgtbRnoiIiHc0m5WI\niIgDKUGLiIjfZWdnk5mZ6dW+q1evJjk52c8RBR+bz0GLiIgdnHInseZ0pikj7aIELSISrLKdW3Zn\nHYN0/vx5QkND/VKWurhFRMQnTz/9NLGxsYSHhzNkyBB27NiBMYaqqipmz55NeHg4w4cPZ8+ePe59\nnnrqKQYPHkx4eDjDhg1j06ZNzZb/0EMPkZyczOnTpykrK2Pu3Ln079+f2NhYFi5cSE1NDQCFhYWk\npKQQGRlJVFQU06dPd5cREhLC888/z6BBg4iKiuIXv/jFBQcRK1euZOjQofTu3Zv09PQLpsK8//77\niY+PJyIigmuvvZZ3333XvS47O5upU6eSmZlJREQEL730UosxtoUStIiIeO3gwYMsX76c3bt3U15e\nTm5uLgkJCViWRU5ODhkZGZSVlTFp0iTuvfde936DBw/m3Xffpby8nKysLGbOnMmJEycuKNuyLO68\n807y8/N566236NmzJ3PmzKFr164cOnSIf/zjH+Tm5vK73/0OgIULF5Kens6pU6coLi7mvvvuu6C8\nTZs2sWfPHv7+97+zefNmVq5cCcDmzZt58skn2bhxIydPniQ5OZmMjAz3fqNGjWLfvn2UlpYyY8YM\npk2bRlVVlXt9Tk4O06ZNo6ysjBkzZrQYY1soQYuIiNdCQ0OprKxk//79VFdXEx8fz8CBAwFITk4m\nPT0dYwwzZ85k37597v2mTp3K5ZdfDsBtt91GUlIS77//vnt9dXU106dP59SpU2zZsoXu3btz4sQJ\n3njjDZ577jl69OhBVFQU8+fPZ926dQB07dqVTz/9lOLiYrp27cro0aMviPXhhx8mMjKSuLg45s+f\nzx/+8AcAXnjhBRYsWMBVV11FSEgICxYsYO/evRw9ehSA22+/nV69ehESEsKDDz5IZWUlBw8edJc7\nevRoJk2aBEBZWVmLMbaFErSIiHht8ODBLF26lOzsbKKjo8nIyOD48eMAREdHu7e75JJLOHv2rLur\n9+WXX2bkyJH06tWLXr16kZ+fzxdffOHevrCwkC1btvDYY48RFlY7XKqoqIjq6mpiYmLc+9199918\n/vnnADzzzDNYlsWoUaMYPnw4q1atuiDWuLg49+P4+HhKSkrc5d5///3uMvv06QNAcXExAM8++yxD\nhw4l
MjKSXr16UVZWxsmTJ91lxcbGuh+3FmNbKEGLiIhPMjIy2LlzJ0VFRRhjePjhh1scxV1UVMRd\nd93F8uXL+fLLLyktLWX48OEXnBO++uqrWblyJePHj6egoACoTbDdunXjiy++oLS0lNLSUsrKyvj4\n44+B2gOCFStWUFxczIsvvsg999zD4cOH3WU2PK985MgRBgwYANQm6xUrVrjLLC0tpaKiguuvv56d\nO3eyePFiNmzYwKlTpygtLSUiIoLm5uZuLca2sDVBG2O0aNHSyiISzAoKCtixYweVlZV069aN7t27\ntzqKuaKiAmMMffv2paamhlWrVpGfn3/RdtOnT2fRokXcfPPNHD58mJiYGMaNG8eDDz7I6dOnqamp\n4dChQ7zzzjsAbNiwgWPHjgEQGRmJMYaQkP+kuWeffZZTp05x9OhRli1bxo9+9CMA7r77bhYtWsQ/\n//lPoLabesOGDQCcPn2asLAw+vbtS1VVFb/61a8oLy9v9r21FmNb2HqZVR55dhYvEvTSSAt0CCI+\nqaysZMGCBXzyySd06dKFG2+8kRUrVvDiiy9edABa//fQoUP52c9+xg033EBISAizZs3ipptuumC7\n+m1nzZpFVVUVY8aM4Z133uHll1/mkUceYejQoZw+fZqBAwfyyCOPALB7924eeOABysrKiI6OZtmy\nZSQkJLjLnTx5Mt/+9rcpKyvjxz/+sXsu6ilTpvDVV18xffp0ioqKiIiIYNy4cUybNo309HTS09O5\n8sorufTSS3nggQeIj49vMtZ6LcXYFrbOZqUELdKyNNI67fWi0jRjPJvNqj16XzrSZzMkJITCwkL3\nADanaO7/DTa3oNU6EGldR+3m7kg/7k6k+u347L2TWJ5a0CKdUpoOzsVZgvFA2NYublsKFpGgoBae\ndzzt4paOIWBd3FkNHqfWLSLBzKDEI/7lcrlwuVyBDkMcSC1okTZSghY7qQXduQSsBW3rTCsS3LKV\n6EREWuJzC9oYcy/wU8uyrm70vH59pVPTAYh4Qy3ozsXuFnQf4MqmV+nDJJ1V8I0YFRFn8flWn5Zl\nPW5Zln9mpxYRkQ4hOzubzMxMR5W3evVqkpOT/RSR/ew9B61WhIiILZx+JzF/xxeM1zH7ytYErfMl\nIiL2sfMX1td06M/f/3PnzvmtrGCi6SZFRMQnTz/9NLGxsYSHhzNkyBB27NiBMYaqqipmz55NeHg4\nw4cPZ8+ePe59SkpKuPXWW+nXrx8DBw7k+eefd6/Lzs5m6tSpZGZmEhERwUsvvXTRa+bk5DBs2DB6\n9epFWloaBw4ccK87evQot9xyC/369aNv377MmzevybgfeughkpOTW5ydKpCUoEVExGsHDx5k+fLl\n7N69m/LycnJzc0lISMCyLHJycsjIyKCsrIxJkyZx7733AlBTU8PEiRMZOXIkJSUl/OUvf2Hp0qXk\n5ua6y83JyWHatGmUlZVx++23X/CaBQUFzJgxg2XLlnHy5El+8IMfMHHiRM6dO8f58+eZMGECiYmJ\nFBUVUVxcTEZGxgX7W5bFnXfeSX5+Pm+99Rbh4eH2V5QXlKBFRMRroaGhVFZWsn//fqqrq4mPj3fP\nGJWcnEx6ejrGGGbOnMm+ffsA+PDDDzl58iSPPvooYWFhJCYm8pOf/IR169a5yx09ejSTJk0CoHv3\n7hd0ma9fv54JEybwve99j9DQUH7+859z5swZ/vrXv/LBBx9w/PhxFi9eTI8ePejWrRujR49271td\nXc306dM5deoUW7ZsoXv37u1RTV6xeZCYiIh0ZIMHD2bp0qVkZ2ezf/9+vv/97/PrX/8agOjoaPd2\nl1xyCWfPnqWmpoaioiJKSkro1auXe/358+f57ne/6/47Nja22dcsKSm5aE7muLg4iouL6dKlC1dc\ncQUhIU23PwsLC/noo494//33CQtzdgpUC1pERHySkZHBzp07KSoqwh
jDww8/3OKo67i4OBITEykt\nLXUv5eXlbN26FahNuC3tP2DAAIqKitx/W5bF0aNHiY2NJS4ujiNHjnD+/Pkm97366qtZuXIl48eP\np6CgwMt33D6UoEVExGsFBQXs2LGDyspKunXrRvfu3QkNbfnWGKNGjaJnz54888wznDlzhvPnz5Of\nn8/u3buB1keAT5s2jddff50dO3ZQXV3NkiVL6N69O6NHj+Y73/kOMTExPPLII3z99decPXuWXbt2\nXbD/9OnTWbRoETfffDOHDx/2rQJspAQtIiJeq6ysZMGCBURFRRETE8PJkyd58skngYuvXa7/OzQ0\nlK1bt7J3714GDhxIVFQUd911l3s0dVMt6IbPXXXVVaxZs4Z58+YRFRXF66+/zpYtWwgLCyM0NJQt\nW7ZQWFhIfHw8cXFxvPbaaxeVMWvWLB577DHGjBnDkSNH7KsgH9g6m5WugxYRaRtP78Xt9BuViGcC\nN5uViIjYQsmz47M1QQfDrdn0IRcRESeyNUHnkWdn8T5LIy3QIYiIiDTJ1nPQthTsYGqNi4ivNB90\n5xK4c9B5zm5B+1WaWuMiIuI/akH7kY5uRcRXakF3Lra0oI0xtwMvNHgq3bKsvzbcJqvB49S6JVAM\nSqAi4jwulwuXyxXoMMSBvG5BG2MuA/o1eKrEsqyzDdY7LhsqQYuI06kF3bnY0oK2LOsr4KsWN8r2\ntvT/7K8PpIhIcHK5XGRmZnL06NFAhxKUdKMSEZEgpDuJdXz2Juhs34vw54dQHzYR6VDsvFJGV6YE\nnM2TZVgOWkRExN9CQkIumBFqzpw5LFy48KLtFi9ezNSpUy947r777mP+/PkArF69mkGDBhEeHs7A\ngQN59dVXgdr5m1NSUoiMjCQqKorp06e79z9w4ABjx46lT58+DBkyhA0bNrjXbdu2jWHDhhEeHk5s\nbCxLlizx6/tuDzZ3cTv/Vp8iIuI/zc3lPHPmTB5//HHKysqIiIjg3LlzrF+/nu3bt1NRUcH999/P\n7t27SUpK4sSJE3zxxRcALFy4kPT0dN5++22qqqrcU1JWVFQwduxYnnjiCd58800++ugjxo4dyzXX\nXMOQIUOYO3cuf/zjH7nxxhspKytz9LSSzbG1BW1ZlqMWERGxX1O/tzExMSQnJ7tbudu3b6dv376M\nHDkSqG2Jf/zxx5w5c4bo6GiGDh0KQNeuXfn0008pLi6ma9eujB49GoCtW7eSmJjI7NmzCQkJYcSI\nEdxyyy3uqSW7du3K/v37KS8vJyIiwv06wUTzQYuISLuYPXs2a9asAWDNmjXMmjULgEsvvZT169fz\nwgsv0L9/fyZMmMDBgwcBeOaZZ7Asi1GjRjF8+HBWrVoFQFFREe+//z69evVyL6+++ionTpwA4E9/\n+hPbtm0jISGB1NRU3nvvvQC8Y98oQYuIiNcuueQSvv76a/ffx48fb3Zw7+TJk/noo4/Iz8/n9ddf\n5/bbb3evGzduHLm5uXz22WcMGTKEO++8E4Do6GhWrFhBcXExL774Ivfccw+HDh0iPj6elJQUSktL\n3cvp06dZvnw5ANdeey2bNm3i888/Z8qUKdx222021oI9lKBFRMRrI0aMYO3atZw/f57t27fzzjvv\nNLttjx49uPXWW5kxYwbXXXcdsbGxAPz73/9m8+bNVFRU0KVLFy699FJCQ0MB2LBhA8eOHQMgMjIS\nYwyhoaFMmDCBgoIC1qxZQ3V1NdXV1Xz44YccOHCA6upq1q5dS1lZGaGhofTs2dNdXjBRghYREa/9\n5je/YcuWLe4u5h/+8IcXrG/cmp49ezb5+flkZma6n6upqeG5555jwIAB9OnTh507d/Lb3/4WgN27\nd3P99dfTs2dPJk+ezLJly0hISOCyyy4jNzeXdevWMWDAAGJiYliwYAFVVVVAbRd6YmIiERERrFix\ngrVr19pcE/5n62QZGpglItI2nt
7qM1hvVHL06FGGDBnCiRMnuOyyy/xefrAJ3HSTIiJii2BsANXU\n1LBkyRIyMjKUnD2gBC0iIrarqKggOjqaxMREtm/fHuhwgoK6uEVEHESzWXUuLXVxa5CYiIiIA9na\nxd0egxhEJLioFSjiGVsTdB42zrQiIkEnDc2QJOIpW89B21KwiAQ1taBbpnPQnUvgLrOyc65SkWCS\nlqbEJCJtokFiIiIiDqQubhEJGPUqXKyj30lMLhSwLu6sBo9T6xYREQBd41HL5XLhcrm82tfOgbga\n0Bd4akGLiLSgvVuRbWlB252gPXnvq1atYuPGjeTk5ACQlJTEyJEjee211wCIi4tj69atrFy5ko0b\nN1JWVkZSUhJLly7lpptuAiA7O5v8/HzCwsLYtm0bSUlJrFq1im984xu2vT+nCNwgsWxbSxcRsVd2\noANwvtTUVB588EEASkpKqK6u5r333gPg8OHDVFRU8M1vfpNRo0aRnZ1NREQES5cuZdq0aRQVFdG1\na1cAcnJyWLduHWvXrmXp0qVMmTKFgoICwsI67x2pfW5BG2PmACuBBMuyjjR4Xi1oEQl6akG3Lj4+\nns2bN3Pw4EHy8vLYt28fL730Ert27WLz5s1s2rTpon169+7N22+/zTXXXEN2dja5ubns2rULqK3z\nAQMG8Nprr7lb2R2V3S3oRGA/cOziVcrRIuJPRgOXHCglJQWXy0VhYSEpKSlERkby9ttv87e//Y2U\nlBQAnn32WVauXElJSQnGGMrLyzl58qS7jNjYWPdjYwyxsbEcP3683d+Lk/gjQY8HfmpZVs3FqzQM\nRESko0tJSSEnJ4dPP/2UX/7yl0RGRrJmzRree+895s2bx86dO1m8eDE7duxg2LBhQG0LuuHB1tGj\nR92Pa2pqOHbsGP3792/39+IkPidoy7JGtbDO1+JFRMThUlJSeOCBB4iJiaF///5cdtllzJw5k5qa\nGkaOHMkbb7xBWFgYffv2paqqiqeeeory8vILytizZw8bN25k4sSJLFu2jO7du3P99dcH6B05g25U\nIiIiPklKSqJnz54kJycDEB4ezqBBg7jxxhsxxpCenk56ejpXXnklCQkJ9OjRg/j4ePf+xhgmT57M\n+vXr6d27N2vXruXPf/4zoaGhgXpLjqD5oEVEHKQz3qjk8ccfp7CwkFdeeaVdXs9JAneZlYiI2KIj\nNYA60nvxJ3Vxi4hIQBlj2qVHINioi1tExEE03WTn0lIXt1rQIiIiDqQELSIi4kBK0CIiIg6kBC0i\nIuJAtl5m1RFG5WlQhog4RUf4TRXP+SVBG2NWA/+yLOvxhs/bOdNKe9CE5SLiFM2N9JWOy19d3Baa\nukpERMRv/NnFfdHRXUdogTbVpaRubxERsZs/E/TFWSsvuLu4m5QW/AcdIiLifLbeScyWgh1ELWkR\n8beW7iwlnYuto7izGjxOrVs6Cn17RMQfXC4XLpcr0GGIA6kF3Ump9S/iTGpBSz17p5vMtrV08VZ2\noAMQEZHWqAXdSakFLeJMakFLPXtb0LZcGm2UXEREpMPTvbhFREQcyOYWtHppREREvGFrglZXtIiI\niHfUxS0iIuJAStAiIiIOpAQtIiLiQErQIiIiDqQELSIi4kBK0CIiIg6kBC0iIuJAStAiIiIOpAQt\nIiLiQF7fScwYsxr4l2VZjxtjaoAEy7KONNrGx/DEiXSHOBER+/lyq0+LVqaryiPPh+LFidJIC3QI\nIiKdgi8JutXmsX7MRUREvONrC7qpx/95Ul2hIiIiXjF2JVFjjLKzdEg68BQ7GWOwLEsDeMTe6Saz\nGjxOrVtEgpl+NcXfXC4XLpcr0GGIA6kFLdJGakGLndSClnq2tqD1QyYiIuId3ahERETEgWxtQetG\nJcFBPR0iIs5ja4Ju5T4m4gg6iBIRcSKbE7R+/EVERLyhQWIiIiIOpEFiIiIiDqQELSIi4kBK0CIi
\nIg6kBC0iIuJAStAiIiIOpAQtIiLiQErQIiIiDqQELSIi4kBK0CIiIg7kc4I2xuQbY7KaWdcui4iI\nSEfjj1t9WjQzK0YeeX4ovmVppNn+GiIiIu3N1ntxK3mKiIh4x97ZrPL81IJOS9PEGyIi0qlokJiI\niIgD+aMF3fworTR1cYuIiHjDX4PEmtRwaHdq3WLQPNEiIvVcLhculyvQYYgDGbuSpTHG8VlYBwoi\n4jTGGCzL0vWj4nsL2hjzF+DPlmUtv2hltq+l2yg70AGIiIg0zx9d3AOBPk2uyfZD6SIiIp2QzV3c\n3pZt1P0sIp2Surilnr3XQbcwwFtERESaZ2uCVitYRETEO53qRiVOvJRBMXnOiXEpJs8oJpG2U4IO\nMMXkOSfGpZg8o5hE2q5TJWgREZFgoQQtIiLiQJ36TmIiIk6ky6wEbEzQIiIi4j11cYuIiDiQErSI\niIgDKUGLiIg4kEcJ2hhzjzHmX8aYM8aY3caYm1rZ/hpjzNvGmK+NMceMMQub2CbFGLOnrsxDxpj/\n1dbg/R2XMSbVGFPTxHKlHTEZY7oZY1YbY/YZY6qMMXnNbOdTXfk7pgDUU6oxZrMxpsQYU1EX24+b\n2K4966nVmAJQT0ONMXnGmM8a1MF/G2O6NNquXb97nsTV3nXVaL8kY8xpY8zpJta122fKk5j8UU8S\nRCzLanEBfgRUAXOBq4BlwGkgrpntw4HPgHXAUOBWoBx4sME2iUAF8Ju6Mn9S9xq3tBaPzXGlAjXA\nEKBfgyXEppguAX5b9/43Ajua2ManurIppvaupwXAr4AbgATgbqAayAhgPXkSU3vX0yBgFnANEAdM\nrPvMLw7wd8+TuNq1rhrs1xXYA2wFygP53fMwJp/qSUtwLa1vAO8DLzZ6rgBY1Mz2/xs4BXRr8Nwv\ngWMN/n4aONhov/8H7PI4cHviqv/w9/GqMtsYU6Pt/i+Q18TzPtWVTTEFrJ4abL8e+KMT6qmFmJxQ\nT79uWAeB+O55GFdA6gp4Dvg9MBs43WhdQD5TrcTkUz1pCa6lxS5uY0xX4FtAbqNVucDoZna7Adhp\nWVZlo+37G2OuaLBNU2Vea4wJbSkmm+Oqt7uu6/J/jDGprcXjQ0ye8LqubIypXiDrKQL4ssHfTqin\nxjHVC0g9GWMGA99vVEYgvnuexFWv3erKGPNfwH8B82h66r12/0x5EFO9NteTBJ/WzkH3BUKBE42e\n/zdweTP7XN7E9icarAOIbmabsLrXbI1dcZVQ21V5S91yEPiLh+eNvInJE77UlV0xBbSejDETgDHA\nigZPB7Sdg+MPAAADGUlEQVSemokpIPVkjNlljDlDbWvtfcuyshusDsR3z5O42rWujDH9qf1f3W5Z\n1tfNlNuunykPY/KlniTI2DHdpFPvfNJqXJZlFVD741HvPWNMAvAQ8K49YQWfQNaTMeZGYC0wz7Ks\n3Xa+lqeaiymA9XQbcBkwAlhsjHnGsqxf2Ph6nmo2rgDU1SvAby3L+tCGsr3Vakz6jepcWmtBnwTO\nU3sk2VA0cLyZfT7j4iPE6AbrWtrmXN1rtsauuJryAZBkU0ye8KWu7IqpKbbXU10rYRuw0LKsFxut\nDkg9tRJTU2yvJ8uyjlmWdcCyrHXAI8D9DbpkA/Hd8ySupthZV2lAljGm2hhTDfwOuLTu75/UbdPe\nnylPYmqKp/UkQabFBG1ZVhW1ownHNVo1FtjVzG5/A5KNMd0abV9sWVZRg23GNlHmh5ZlnW8taBvj\nasoIaruV7IjJE17XlY0xNcXWejLGfJfaRJhlWdayJjZp93ryIKamtPfnKZTa73n9dz0Q3z1P4mqK\nnXU1HPhmg+Ux4Ezd4z/WbdPenylPYmqKR/UkQai1UWTUdktVUnupwNXUXnJQTt2lAsCTwP802D6c\n2iPEPwDDqD1PUgY80GCbBOArakcrXk3t5QuVwA89Hd1mU1zz
gcnUHo0OqyujBphiR0x1zw2l9gu2\nDviQ2i/jCH/VlU0xtWs9UTtytYLaUbXR1LZqLgeiAlVPHsbU3vWUCUyl9hKcgXX7HwPWBPi750lc\n7f7da7T/HC4eMd2unykPY/KpnrQE1+LZRrWXKP0LOEvtD/ZNDdatAg432n448Da1R3/F1Hb/NS7z\nu9QeYZ4FDgF3tTl4P8dF7XmcAuBr4Iu6bdNtjulfdV+wGmq7xGqA8/6sK3/H1N71VPf3+QYx1S+N\n4263evIkpgDU0/S6919O7fW2+dR2JXdrVGa7fvc8iau966qJfefQ6JrjQHz3WovJH/WkJXgWzWYl\nIiLiQLoXt4iIiAMpQYuIiDiQErSIiIgDKUGLiIg4kBK0iIiIAylBi4iIOJAStIiIiAMpQYuIiDiQ\nErSIiIgD/X9zx531pAILaAAAAABJRU5ErkJggg==\n",
738 "<matplotlib.figure.Figure at 0x7ff441e4f9b0>"
742 "output_type": "display_data"
746 "ax = punctuation_normalised[punctuation_normalised.sum(axis=1) > 0.1].plot(kind='barh',fontsize=14)\n",
747 "ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))"
751 "cell_type": "markdown",
754 "## Visualising the punctuation\n",
755 "Let's print the punctuation sets side-by-side to compare them."
760 "execution_count": 83,
767 "output_type": "stream",
769 ".\",.\"\",'.\".\"\"-\".\".\",\"\"\"\"\"\"\"\"\"\".\"\".\",;,\"\".''''''''' ,.\",\",\"'..?\"\".,\",,\",,,.,..?.?\"\",\".\",\",,\"?.\",..\",\",\n",
770 "''',''''\".\",-,.--,.',--',,,\",.\"\".\"..-''..,,.,,\"',. .,',.',..,,(),:\".?.\".\",\",--\".?',.',\".\".,'.\".\"',\".\"\n",
771 ".\",\".\"\",.\"..',,\"\",\"\",....\"\"-\"\".,..,,\",,..\"\".,.,,.- .\"\"';.?\"(),\"'....\",,..\"?\".\",.,..\",.\"?,\",.\"...'!\",.\n",
772 ",-.,,.-,,,,,.,,,,.\"\".\"\",.\"\".\",.,.\"\",.,,,.,\",.\",\".\" .\"?\".\",',.,.\",,,.\",\",,\"?,\",\",?\":\"'....?\"\"..-,'.',.\n",
773 "\".\"\",\";.\"\"\".\"\"\"\".\",\"\"\".\",.,,\"\"\",.,..\"\",\"\".,,.\"\";\". .;,.--'.\"\",,\",'.\"-,.'.\",',,,.\",\",,\"',',,.''.\"'.:.'\n",
774 "\"\",\".\",-,\"\"\",,\"..\"\",\",.\":,,-,.\"\",\".,.--.\"\".\"!.-!,! ,,';.,,*.,,.',,,..*.,\",\"\"?\",,;,'.,,;.,,,\",,.\",,.-.\n",
775 "-!-!-!,,,,\"\".-\"\"\"\"\"\"\"\"\"\".,\"\"\"\",!\"\"-\"\"\"\"\"\"\"\"\"\"\"\"\"\", ,,,,.,-----.,,,.,,,,.,,.,,,,,.\",\",.\",,,\",.\",.\"-,-,\n",
776 "!!\"\"-\"\"\"\"..\"\"\"\"\"\".\"\"\"\",\"\"....\"\"\"\"\".\"\".\"\".\",.\"\"\"\"\"\" ,.\",,,\".\",\",,\"?.?\",,,.\"!\".,-,,-,,.,-'.,,..-,,,.,,,\n",
777 "\"\"\"\"-,...\"\"\"\"...,.,.,-\"\"\"\"\"\"\"\".\"\",\".\",.,,\"\".\"\"\"\".\" .\",,,\",.,.,.','.:\"?.\"\",,.\"\"?\"..,.,,'.\",\".,,.,,,,,,\n",
778 "\",\"\"\"\"\"\"\"\"\".\"\".-.\"'\".\",,.'\".\"\".\"\"\"\"\",-,,.-,\",.\"-'\" -,,,..,.,'.,,,.-..,.'..,,,.,,.,,,',,,..-,..',,,.'.\n",
779 ".',.'.,,,.,,,,.,,,,,,,..,-,--,,.',.,-,.,.\",\",,,.\"\" ','.\",!,,\",:\",.\".,.\",\".\",\".\",\";,-..\",,,\",..--.,,,,\n",
780 "\"'.,\"\"'.,,\"\";.,.'..,..,,,..-,,,.,-.,.\",,.,,,,,,\"\"\" ,,,,----.,.,,.\"!\";.\",,\",...,,,,.,,',..\",'....,?\",.\n",
781 ".\",'..-,.,,,.,.,.,,,,..,..-,.,,.\"...,?,,?,.,.,'.,. \".\",,.\",\",,.,,.,.',,-,,,,-,.,,,,.\"'?\",,.\",,\",.\",\".\n",
782 ",,\"\"\".\",.,,,-.,,.\",-,,,..,.,,'','&',..'\",,-,,.'.,, -.-,,,,..,,';,',,'.,.,;.\"!\".\"!\",.,,,,.,,'-,.,.\"...\n",
783 ".'.,',''\",.,.,.'.',''-,.\".',...,...,,,.''.''.!'''. ,\".\"----,!\"\"?\".,,.',,.\",,,,\".,,,.:,'.,,,.,,,,,.,.,\n",
784 "',,,,''\"-,,,,,,.,,.,.,,.,-\"\"\";\"\"\",.,.,,,,.''..,\"\"\" .,',.\",?\".\",\",',\"--....\"\",?\"\".\"\"?\"\",\",,\"!\".,,,.,,'\n",
785 "\"\",.\",.,,-\"\"\"\"'\"\"\"\"\"\"\"\",\"\"\"\"\"\"\"\"..\",\",...,,,.\"\"\"\". .\"!...,,?\".\",\".\".?\".\",!\",'.,.\",,\",.\"',.,\",.,,,,.,,\n",
786 "..,.\"\"\"\"....-.\"\"\"\",\"\"\"\"--,,,.\"\"\"\"\",-.\"'-,-..,.,.\"\" .\",\".\",\".':\"!..\".'..-.\",?\",.\"'..\",,,.\",?\".\",,,\",\".\n",
787 ",,,,,\"\"\"\"\"\".,,\"-.,,,,...,,.,,.,.,,.',.,,.,-,-,-.\"\" ..\",,,..,,''.',,;'.\",,\".\",';'--,\".\",',!.,\",.\",,\",.\n",
788 ",,\".-..,.,\"\",\"\"..'...,,\"\",\"\"...,.?,.,..\"\"\"\"!\"\"\"\"\"\" ,,.,,,.',,.;.,----,,,,..\",\",\";',--..?\"\"!--!\".\"--!.\n"
794 "for i in range(5,25):\n",
795 " print(sherlock['punctuation'][line_len*i:line_len*(i+1)], wap['punctuation'][line_len*i:line_len*(i+1)])"
799 "cell_type": "markdown",
802 "Again, now that I know it's working, I'll wrap it in a function."
807 "execution_count": 84,
813 "def compare(text1, text2, offset=0, line_len=50, max_lines=30):\n",
814 " for i in range(offset, min(max(len(text1), len(text2)), line_len * max_lines), line_len):\n",
815 " t1 = text1[i:i+line_len]\n",
816 " t1 += (' ' * (line_len - len(t1)))\n",
817 " print(t1, text2[i:i+line_len])"
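One thing to notice about `compare` as written: the stop position is `line_len * max_lines` measured from the start of the text, not from `offset`. With `offset=100`, `line_len=50` and `max_lines=30`, the loop runs over `range(100, 1500, 50)`, which gives 28 rows rather than 30. Below is a minimal sketch of a variant that counts rows from the offset instead. The name `side_by_side` and the stand-in strings are my own assumptions for illustration, not part of the notebook, and it returns the rows rather than printing them so they are easier to inspect.

```python
def side_by_side(text1, text2, offset=0, line_len=50, max_lines=30):
    """Return up to max_lines rows pairing line_len-character slices of two strings."""
    # Unlike compare(), the stop position is measured from offset,
    # so a non-zero offset does not reduce the number of rows.
    stop = min(max(len(text1), len(text2)), offset + line_len * max_lines)
    rows = []
    for i in range(offset, stop, line_len):
        # Pad the left slice so the right column always starts at the same place.
        left = text1[i:i + line_len].ljust(line_len)
        rows.append(left + ' ' + text2[i:i + line_len])
    return rows


# Hypothetical stand-in strings, not taken from the novels.
a = '.,' * 200
b = '!?' * 200
rows = side_by_side(a, b, offset=100, line_len=50, max_lines=5)
print('\n'.join(rows))
```

Whether this behaviour is preferable depends on whether `max_lines` is meant as "rows printed" or "rows of text covered"; the recorded outputs in this notebook come from the original `compare`.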
822 "execution_count": 85,
830 "output_type": "stream",
832 ",,,..\"\".\",,\"\"\".\",.,,.,.\"\",\"\",.,\"\"\",\".,.,'.,,,,,\",. !!\",.,,,,.,,.,,,,,.\",,.',\",.\"??\".\",?\"\"'?.,\".\".\"\"'.\n",
833 "\"\";\",,..,,,-.,,,-,,,\".\"\",\",.\"\"\",,.\",..,\"\"\"\"\"\",\"\"\"\" .\"\",,\",,-,.\"'!,'?.\"\"?\",.\"?,.\",.,,.,,.,,,,,,,,.:\",'\n",
834 "?'\"\"!...,,.--,,,\",--.\"\".\"\",.\"-,'\",\"...,\"\"\".\"\"\"..,. .',,,.!..!,.,!....,,?...'..,,.?.-,.?!!,....',...!\"\n",
835 ".\",.\"\",'.\".\"\"-\".\".\",\"\"\"\"\"\"\"\"\"\".\"\".\",;,\"\".''''''''' ,.\",\",\"'..?\"\".,\",,\",,,.,..?.?\"\",\".\",\",,\"?.\",..\",\",\n",
836 "''',''''\".\",-,.--,.',--',,,\",.\"\".\"..-''..,,.,,\"',. .,',.',..,,(),:\".?.\".\",\",--\".?',.',\".\".,'.\".\"',\".\"\n",
837 ".\",\".\"\",.\"..',,\"\",\"\",....\"\"-\"\".,..,,\",,..\"\".,.,,.- .\"\"';.?\"(),\"'....\",,..\"?\".\",.,..\",.\"?,\",.\"...'!\",.\n",
838 ",-.,,.-,,,,,.,,,,.\"\".\"\",.\"\".\",.,.\"\",.,,,.,\",.\",\".\" .\"?\".\",',.,.\",,,.\",\",,\"?,\",\",?\":\"'....?\"\"..-,'.',.\n",
839 "\".\"\",\";.\"\"\".\"\"\"\".\",\"\"\".\",.,,\"\"\",.,..\"\",\"\".,,.\"\";\". .;,.--'.\"\",,\",'.\"-,.'.\",',,,.\",\",,\"',',,.''.\"'.:.'\n",
840 "\"\",\".\",-,\"\"\",,\"..\"\",\",.\":,,-,.\"\",\".,.--.\"\".\"!.-!,! ,,';.,,*.,,.',,,..*.,\",\"\"?\",,;,'.,,;.,,,\",,.\",,.-.\n",
841 "-!-!-!,,,,\"\".-\"\"\"\"\"\"\"\"\"\".,\"\"\"\",!\"\"-\"\"\"\"\"\"\"\"\"\"\"\"\"\", ,,,,.,-----.,,,.,,,,.,,.,,,,,.\",\",.\",,,\",.\",.\"-,-,\n",
842 "!!\"\"-\"\"\"\"..\"\"\"\"\"\".\"\"\"\",\"\"....\"\"\"\"\".\"\".\"\".\",.\"\"\"\"\"\" ,.\",,,\".\",\",,\"?.?\",,,.\"!\".,-,,-,,.,-'.,,..-,,,.,,,\n",
843 "\"\"\"\"-,...\"\"\"\"...,.,.,-\"\"\"\"\"\"\"\".\"\",\".\",.,,\"\".\"\"\"\".\" .\",,,\",.,.,.','.:\"?.\"\",,.\"\"?\"..,.,,'.\",\".,,.,,,,,,\n",
844 "\",\"\"\"\"\"\"\"\"\".\"\".-.\"'\".\",,.'\".\"\".\"\"\"\"\",-,,.-,\",.\"-'\" -,,,..,.,'.,,,.-..,.'..,,,.,,.,,,',,,..-,..',,,.'.\n",
845 ".',.'.,,,.,,,,.,,,,,,,..,-,--,,.',.,-,.,.\",\",,,.\"\" ','.\",!,,\",:\",.\".,.\",\".\",\".\",\";,-..\",,,\",..--.,,,,\n",
846 "\"'.,\"\"'.,,\"\";.,.'..,..,,,..-,,,.,-.,.\",,.,,,,,,\"\"\" ,,,,----.,.,,.\"!\";.\",,\",...,,,,.,,',..\",'....,?\",.\n",
847 ".\",'..-,.,,,.,.,.,,,,..,..-,.,,.\"...,?,,?,.,.,'.,. \".\",,.\",\",,.,,.,.',,-,,,,-,.,,,,.\"'?\",,.\",,\",.\",\".\n",
848 ",,\"\"\".\",.,,,-.,,.\",-,,,..,.,,'','&',..'\",,-,,.'.,, -.-,,,,..,,';,',,'.,.,;.\"!\".\"!\",.,,,,.,,'-,.,.\"...\n",
849 ".'.,',''\",.,.,.'.',''-,.\".',...,...,,,.''.''.!'''. ,\".\"----,!\"\"?\".,,.',,.\",,,,\".,,,.:,'.,,,.,,,,,.,.,\n",
850 "',,,,''\"-,,,,,,.,,.,.,,.,-\"\"\";\"\"\",.,.,,,,.''..,\"\"\" .,',.\",?\".\",\",',\"--....\"\",?\"\".\"\"?\"\",\",,\"!\".,,,.,,'\n",
851 "\"\",.\",.,,-\"\"\"\"'\"\"\"\"\"\"\"\",\"\"\"\"\"\"\"\"..\",\",...,,,.\"\"\"\". .\"!...,,?\".\",\".\".?\".\",!\",'.,.\",,\",.\"',.,\",.,,,,.,,\n",
852 "..,.\"\"\"\"....-.\"\"\"\",\"\"\"\"--,,,.\"\"\"\"\",-.\"'-,-..,.,.\"\" .\",\".\",\".':\"!..\".'..-.\",?\",.\"'..\",,,.\",?\".\",,,\",\".\n",
853 ",,,,,\"\"\"\"\"\".,,\"-.,,,,...,,.,,.,.,,.',.,,.,-,-,-.\"\" ..\",,,..,,''.',,;'.\",,\".\",';'--,\".\",',!.,\",.\",,\",.\n",
854 ",,\".-..,.,\"\",\"\"..'...,,\"\",\"\"...,.?,.,..\"\"\"\"!\"\"\"\"\"\" ,,.,,,.',,.;.,----,,,,..\",\",\";',--..?\"\"!--!\".\"--!.\n",
855 "\"\"\"\"\"\"...\"-..,,,.,,,-,.,,,,.;,.,-,,.,,;,.\"\".\"\".\",, ..\".\"...!,...\".\",'.'..\"\",!'!...\"\",\",\".\"\",!-!?\"\"?\"\"\n",
856 "'\".\"'\"\"'\".\"''.,,.,'\"\"'.,\"\".-..,\",.,,.,,..,-.,,..,. ;,'.\"\",,!\",,,-..,.,,..\",?\",\",?!'!.\".\"',!'*,\",:\"''.\n",
857 ".\"\",-,,--\"\".,.,',..\",\".\".\"\"\"\"\"\"\"\",\"\"\"\"\",.\".,,.\"\"\"\" !'\"*,!\",\".\".\"\"?,\",:\",...,,?!\".\",.!,.\",.,,,..\",'--,\n",
858 ",,.,,,.\"\"\"\"..?-,.,.,,,\"\"\"\"-.,.,.,.;-.-.....-.,-.,, \".,.\",\",,,,\".,,,,----,...\".,,,,:\",\",,\";,,\",.\",\".\".\n",
859 ",,.,,,.;,.-\"\"\".\".-,,.-,.\"\"\"\".,.,,.\".:\"-,\",.\"'\",.\", .\"\",\",,\"'.\"\",\".\".\"\",\".,.\"',,'\",'.\"'.'.\"\",\".\".,\",,\"\n"
864 "compare(sherlock['punctuation'], wap['punctuation'], offset=100)"
869 "execution_count": 86,
876 "output_type": "stream",
878 "..;,,',.'...,;,,.,.,,;,,',',,',,.,;',,,,.,,,,,.;,, ,,,..\"\".\",,\"\"\".\",.,,.,.\"\",\"\",.,\"\"\",\".,.,'.,,,,,\",.\n",
879 "-,',,;'.,;,,.',;':.!,,;,.,,',';;',';,,'.?,',',,;,, \"\";\",,..,,,-.,,,-,,,\".\"\",\",.\"\"\",,.\",..,\"\"\"\"\"\",\"\"\"\"\n",
880 ",,,.,;,--,.,,;,;,.,,',,,,.,:,?,:,..,!??,.!,,;,,!'. ?'\"\"!...,,.--,,,\",--.\"\".\"\",.\"-,'\",\"...,\"\"\".\"\"\"..,.\n",
881 ",!'.,!'.,,,,,,,,,,,,,'!':.',:,,,,'.:,,.,,:;.,,,.', .\",.\"\",'.\".\"\"-\".\".\",\"\"\"\"\"\"\"\"\"\".\"\".\",;,\"\".'''''''''\n",
882 "'-,,,,,.!',,',,',,,,,-.,.,.!??.:!-!'',,.:!,,,;,,'. ''',''''\".\",-,.--,.',--',,,\",.\"\".\"..-''..,,.,,\"',.\n",
883 ",,'.!,'.,.!.,.!.,.,.,,.,:!:;.,':!,,'.,.-,',,',''., .\",\".\"\",.\"..',,\"\",\"\",....\"\"-\"\".,..,,\",,..\"\".,.,,.-\n",
884 "-,,;,.,:;!,:'.,.,:,!'!;?;;,',,.,,.,,'.';:,'.,';'', ,-.,,.-,,,,,.,,,,.\"\".\"\",.\"\".\",.,.\"\",.,,,.,\",.\",\".\"\n",
885 "';,',.':-;,:,.?,,.',,,-.,,;,.,,,.,,.,,.,..,..,.,,. \".\"\",\";.\"\"\".\"\"\"\".\",\"\"\".\",.,,\"\"\",.,..\"\",\"\".,,.\"\";\".\n",
886 "?,?,.:,;,.:.,,.:'.!.',';.,-.,..??.,,;.':,.,.',',!' \"\",\".\",-,\"\"\",,\"..\"\",\",.\":,,-,.\"\",\".,.--.\"\".\"!.-!,!\n",
887 "',,;,!',;;,.,.,.,.,'.,.,.,';,';,,':,,.'?,,,.,..,'; -!-!-!,,,,\"\".-\"\"\"\"\"\"\"\"\"\".,\"\"\"\",!\"\"-\"\"\"\"\"\"\"\"\"\"\"\"\"\",\n",
888 ",,',.',,;.,'.,,,;;'.;-;,';,-;,.,.?,.-,-,--,-,.,-., !!\"\"-\"\"\"\"..\"\"\"\"\"\".\"\"\"\",\"\"....\"\"\"\"\".\"\".\"\".\",.\"\"\"\"\"\"\n",
889 ";,,,-,,,:;,,.,.,.;.;;.'.;,-.,!?,,,,,,,,,';,:;;,,:- \"\"\"\"-,...\"\"\"\"...,.,.,-\"\"\"\"\"\"\"\".\"\",\".\",.,,\"\".\"\"\"\".\"\n",
890 ",'.,:';.-.;,,';;,;,,,,.,,,,;,,-.,':;,,;;-,?,,:?,'; \",\"\"\"\"\"\"\"\"\".\"\".-.\"'\".\",,.'\".\"\".\"\"\"\"\",-,,.-,\",.\"-'\"\n",
891 ".,-,:',;,,'.,,-;,,'',;;,,.,,!..!,.!.,:.,!?,;',,,., .',.'.,,,.,,,,.,,,,,,,..,-,--,,.',.,-,.,.\",\",,,.\"\"\n",
892 "?,,,',,.,,,?,?,,?:,',,,,,,,,,'.,,,';,,:',,':,;'',: \"'.,\"\"'.,,\"\";.,.'..,..,,,..-,,,.,-.,.\",,.,,,,,,\"\"\"\n",
893 "::,,,,::-,',,.,,,,,,,.,:.;.?,.;.:,,,',',;'-;,,,,,, .\",'..-,.,,,.,.,.,,,,..,..-,.,,.\"...,?,,?,.,.,'.,.\n",
894 ",,.,,;,.?,'.,,;,,.,..,!,.,:.,.',',,-'..,,,':,'-,;' ,,\"\"\".\",.,,,-.,,.\",-,,,..,.,,'','&',..'\",,-,,.'.,,\n",
895 "'',,,-.':,-,',,--.;':.;.'.',:,,,,,,,:,,'.?,.,.?',. .'.,',''\",.,.,.'.',''-,.\".',...,...,,,.''.''.!'''.\n",
896 "'';,,.!,.,-:,:,.??,,?.;,,,:,,,,;,,.,,?,..,,;.:,;,: ',,,,''\"-,,,,,,.,,.,.,,.,-\"\"\";\"\"\",.,.,,,,.''..,\"\"\"\n",
897 ",?',..,';,;;:,.:;,,.,,,,.,!.,;'.',.,:,,.?,.,.,.,-, \"\",.\",.,,-\"\"\"\"'\"\"\"\"\"\"\"\",\"\"\"\"\"\"\"\"..\",\",...,,,.\"\"\"\".\n",
898 "-,:,';',:',.,::;..,..,,.,;,,;-,-,,,,.;,...,-,;,-,; ..,.\"\"\"\"....-.\"\"\"\",\"\"\"\"--,,,.\"\"\"\"\",-.\"'-,-..,.,.\"\"\n",
899 ".,,:,,;,,:,,,;,,..;,-',!,;,.,,&.,!..,-;:,,,,,',.., ,,,,,\"\"\"\"\"\".,,\"-.,,,,...,,.,,.,.,,.',.,,.,-,-,-.\"\"\n",
900 ";,:',,,.,:,.;,,,.,;,,,.!,,,'.,;;.-,,,,.:,.,,;,,;,, ,,\".-..,.,\"\",\"\"..'...,,\"\",\"\"...,.?,.,..\"\"\"\"!\"\"\"\"\"\"\n",
901 ".'!,,,;!:!''!,,'.!?:,,;,,.!-,-.,.',:;.,,.,,.!?.,:. \"\"\"\"\"\"...\"-..,,,.,,,-,.,,,,.;,.,-,,.,,;,.\"\".\"\".\",,\n",
902 "!.,.,';.?:,'.,,;;,,.'??!!??,.,,,..!,.?!.,;.?!?:.!: '\".\"'\"\"'\".\"''.,,.,'\"\"'.,\"\".-..,\",.,,.,,..,-.,,..,.\n",
903 ".,:?',.;,,;,,;'''.??',',,,,,',?,,,,.:.!','..,;.,;, .\"\",-,,--\"\".,.,',..\",\".\".\"\"\"\"\"\"\"\",\"\"\"\"\",.\".,,.\"\"\"\"\n",
904 ":,,,!,,,.,,!.,!!,:,.!!'?!!!??,?!?,;,!.!:'.?,;'.,-- ,,.,,,.\"\"\"\"..?-,.,.,,,\"\"\"\"-.,.,.,.;-.-.....-.,-.,,\n",
905 ";.,,?.,,.?',.,.:.;,,;,,,,,:.,,.,:.?,.,:,!,;-,.,.,, ,,.,,,.;,.-\"\"\".\".-,,.-,.\"\"\"\".,.,,.\".:\"-,\",.\"'\",.\",\n",
906 "';,,,',',','',',',',',,:.,:,:;',.,.,,;,,.?,!;,.,., \".,.\"\",.\"\"\"\"\"\"\",.\"\"\"\",\"\"\".\"\".\"!\"\"\"\"\"\"\"\"\"\"\"\"\"\".,.,'\n",
907 "-,,;.;,,,.,:;,,..,?;,,-,;,.,.,,',.,:,;.-,?!;';,.,. \"\".-!!\",.,..\".,\".\".\",.\"!.:\"\"\",.\"\"\"\"\"\".\"\"\"\"-,.,,.-,\n",
908 ",.,,,.:,.,!,.''!?,,;,,.,-,,,,,',,'.'','.,;:,.,:;'. ,,,.,\",.\".:\".--..,.,,....,,.,,.,,...,,,,-,,.\",,..,\n",
909 "'!,.,,.!!.,!,!!',',,,,:',,,,;,,,,,,,,,,.?.,!??-,?, ,-,.\",;-.,...,.;,.,\",\",\"\"-,\",.\"??\"\"\".\"'\"\",\";\"..\"\"\"\n",
910 "!!.:;,.,:,,.-,,-,,.?,,,-,,,;,?,''?,,:';;',,,,,,.,, \"..-\".\"\".\"\"\"\".\"'\".\",\"\"..-\",,,.,.'.,.,,..-,.,,-,.,.\n",
911 ":,,--.,.,..,;,.:,'.;;:,;',,,:.-!!!!....?;,;,,,.--, \",\".\"\"\".\"\"\"\".,.,,\",-.\"\",,.\",,.,,,\"\"\".\",,,\"\"\"\",,,.,\n",
912 "-,-',,;:,,.,!!!!,:'..,:,.,?-.,,,,.-,.,,?-.-,:,--., .,.,,,.,.,.,...,,.,,\".,,,,.,,.,,,.',--,,,.-.,,,.',\n",
913 ".,-.,;.,;,,.',.';,,..,!-?.,,,,,'.-,,',;':,.,,-,,', .\",,,,,\".,,.\",-,,.\".\",,.','\"\",..,\"\",,,\"\"',,,--\"\",,\n",
914 ",;,,,,,'';,.,,;;,,.,;,,''..''-,?,',;,','.:.;.!?.;, .\"\",\"\",\"\"..'.,,-,\"..\",\".\",,\"\",\",\".'',,,.,.\"\",\".\"..\n",
915 ",,.,',,..'?'',,'.';,,.',,';,,,,.'??!,?.,!,!''.?'!! ,\":\"-:,,,..,.--,.,',,,',\"\"\".,.\",'\".\",.,,,.,,\"\",.\"\"\n",
916 ",,';','?!,,?;,,.':',,.,.,?.;,'.:.';,.?,--':-','.'- .,.\"\",,.\",;\"',.',.,;\"\"\".\",',.'.,.;.,,,\"\",?..'\"\",,\"\n",
917 ",,,,.,:-,,.:'.,;;'.,',.,.',,.,,,,'.?,!:.;;.?:,,;,. ..\".,.,'.'\"\",\"\",.,-',.,,;,.\".,,,:',.,-'''.'',''-.'\n",
918 ",?.,-!':'?,:,,,,..,,'.,.!,,,!,,?.!,,,,',','.!,.!!: ,,'.,''',,'.,.,--,,-.',.'-'.''',,'''.',,,''\",,-,.'\n",
919 ",.,,?,,;,,,.,,,:,,'!,',.,;;;:,,,';,,..,;:'',.',',. '.'',,',.,,,.-,-;,.''','-'''.',.,.,,,,,,.,,.,;'\",,\n",
920 ",.,.!:.,,;,.,,;,,.?,??',,,.'?,?:.!.'.!!','?',-,,-, ,,,.,.,.\",..,,,.-,''..-,,,,-,,;,,-.,;.,,.,,;\"\"\".\"\"\n",
921 "!?-',?,,,,,,,,,,,,.,,,;;,,;,,,.,?,':,,,..:.,,,,,,, \",.,.,.,,,.'.',''''.'.',,.,,.''.',,,',.''.'.,.''.,\n",
922 ",,,??,,,,,,?,,,'?..,,,,;;:,,.,,,.,:',.,!:,,,!!,.,. -.'','.,.,.?'\".\".'','!.,,-.'\",.,;.'',',.'',,'.',,.\n",
923 ":.,;,:,..,,.,!,?,!,,'...;,,:,!,,!,,.?,,!,,!,!,!?,; '.''''.''\"',.,,-;.,,.''.''''''''''',,,.,..'''','.'\n",
924 ".,.,:'.!,,?,'.!?!?!,???.';,:,,,!,?,;.,,;,:',.!!-!! '.;'.,''''\"\".,,-,.-'''.',-,.,',,.\",,;,.,'',.,,,-,,\n",
925 "!''?,'!,,?!?,!,!!,:,.:';,,,,'.,?,?;?.,,,:;;:.,,.!, '.\",,.,..,;.'-,,.\",.,'.,.,..,,,.,,,,,,.\",,'.,.\"\"\"\"\n"
930 "compare(shakespeare['punctuation'], sherlock['punctuation'], offset=100, max_lines=50)"
935 "execution_count": 87,
942 "output_type": "stream",
944 ",,,..\"\".\",,\"\"\".\",.,,.,.\"\",\"\",.,\"\"\",\".,.,'.,,,,,\",. ?,,:--?!,.--,,.--,?--?.--,'?..'.,!..,,.'.,:,-..--,\n",
945 "\"\";\",,..,,,-.,,,-,,,\".\"\",\",.\"\"\",,.\",..,\"\"\"\"\"\",\"\"\"\" .?--!.?--,.'..',....--!.,',:--...,,:--'!:.,'?,.--!\n",
946 "?'\"\"!...,,.--,,,\",--.\"\".\"\",.\"-,'\",\"...,\"\"\".\"\"\"..,. .':?..__.,,!.._!_!....--!.'.--,.''.--,.--,,,,.'...\n",
947 ".\",.\"\",'.\".\"\"-\".\".\",\"\"\"\"\"\"\"\"\"\".\"\".\",;,\"\".''''''''' .....--!.,!,,.,,-.,,.,,,,,,,.....--,!..?--,..--,..\n",
948 "''',''''\".\",-,.--,.',--',,,\",.\"\".\"..-''..,,.,,\"',. .,.'.',.'.--,.'.--',..'...--,,...'.!...--,,!,...?.\n",
949 ".\",\".\"\",.\"..',,\"\",\"\",....\"\"-\"\".,..,,\",,..\"\".,.,,.- .--',.....,'.--,.!,:--.-.',.--',,?.....--!.''..,,.\n",
950 ",-.,,.-,,,,,.,,,,.\"\".\"\",.\"\".\",.,.\"\",.,,,.,\",.\",\".\" .'..--.'.'???''.'.:,.,!,!!,,'.'.'!'!.,,',........-\n",
951 "\".\"\",\";.\"\"\".\"\"\"\".\",\"\"\".\",.,,\"\"\",.,..\"\",\"\".,,.\"\";\". -,.'.--?..'.?,..--?.--,?.'.'.,.,,:--'?:--??'..??--\n",
952 "\"\",\".\",-,\"\"\",,\"..\"\",\",.\":,,-,.\"\",\".,.--.\"\".\"!.-!,! ,,...--?.?.--,,_,'._'.--?.??.--,,'?..'.'.'.?,'.'..\n",
953 "-!-!-!,,,,\"\".-\"\"\"\"\"\"\"\"\"\".,\"\"\"\",!\"\"-\"\"\"\"\"\"\"\"\"\"\"\"\"\", .'.''.!.'..,,:--.--?.--,..--,!..,..,,.:--,?--',.:-\n",
954 "!!\"\"-\"\"\"\"..\"\"\"\"\"\".\"\"\"\",\"\"....\"\"\"\"\".\"\".\"\".\",.\"\"\"\"\"\" -.?,,..,:--',.'..:_'._.,..,.,..,,.,.':,.:...,:'.?:\n",
955 "\"\"\"\"-,...\"\"\"\"...,.,.,-\"\"\"\"\"\"\"\".\"\",\".\",.,,\"\".\"\"\"\".\" ,,,..:_._,:._._...,,.'.,,,,,,.,,....,.._:._!!,!.--\n",
956 "\",\"\"\"\"\"\"\"\"\".\"\".-.\"'\".\",,.'\".\"\".\"\"\"\"\",-,,.-,\",.\"-'\" !'.,.,',.--,,...'.--',,.--,',...--.'.,?,.--,.--?.?\n",
957 ".',.'.,,,.,,,,.,,,,,,,..,-,--,,.',.,-,.,.\",\",,,.\"\" ?.--,.--,.'..,:_,',,!,!,'!_.,,.?,?,,,.....',.:,.--\n",
958 "\"'.,\"\"'.,,\"\";.,.'..,..,,,..-,,,.,-.,.\",,.,,,,,,\"\"\" ',.,,?.,.--?.--,.,'!,:--!--',,.,,.,...,.--',,...,!\n",
959 ".\",'..-,.,,,.,.,.,,,,..,..-,.,,.\"...,?,,?,.,.,'.,. !,!,,.,..,,.'?,,'...--?..--,.'.--,!..:--.--!,....,\n",
960 ",,\"\"\".\",.,,,-.,,.\",-,,,..,.,,'','&',..'\",,-,,.'.,, '.,:--_._.--',.,,,,'?,,':--,..--,,.:--_,,_._,',_,_\n",
961 ".'.,',''\",.,.,.'.',''-,.\".',...,...,,,.''.''.!'''. '._,.--',,,...,:--,,'?--,.--?.,?--,,.,,.'.--!,.?!,\n",
962 "',,,,''\"-,,,,,,.,,.,.,,.,-\"\"\";\"\"\",.,.,,,,.''..,\"\"\" ,:_--'.,..._..--,!--,',.,.'.--',,..--?,.,!.--,,.--\n",
963 "\"\",.\",.,,-\"\"\"\"'\"\"\"\"\"\"\"\",\"\"\"\"\"\"\"\"..\",\",...,,,.\"\"\"\". ,?.--,.,...,.,.,,.,.,.,,,.,:.--,',,.--,,..--,,'.,,\n",
964 "..,.\"\"\"\"....-.\"\"\"\",\"\"\"\"--,,,.\"\"\"\"\",-.\"'-,-..,.,.\"\" '.--,?.--,',.--,..,,:.','','..--?.--,?.,.--,.?--,,\n",
965 ",,,,,\"\"\"\"\"\".,,\"-.,,,,...,,.,,.,.,,.',.,,.,-,-,-.\"\" .,?--,.--',,.--,,''.''.--,..,.,'?--,,,,.:--?,,'?.-\n",
966 ",,\".-..,.,\"\",\"\"..'...,,\"\",\"\"...,.?,.,..\"\"\"\"!\"\"\"\"\"\" -,?,.,'.',.,,.--,,.,.,:--!,:--,...--',.--,,,..,.,'\n",
967 "\"\"\"\"\"\"...\"-..,,,.,,,-,.,,,,.;,.,-,,.,,;,.\"\".\"\".\",, :_--,,._:--,.'....--,,,.--,.:--,?:--.--,.:--.....'\n",
968 "'\".\"'\"\"'\".\"''.,,.,'\"\"'.,\"\".-..,\",.,,.,,..,-.,,..,. .--.':--,.--,,,..--?.,,:--','..:--.?--?..?.',.--,,\n",
969 ".\"\",-,,--\"\".,.,',..\",\".\".\"\"\"\"\"\"\"\",\"\"\"\"\",.\".,,.\"\"\"\" .--,,.'.--,,.:--''..'?..,,:--..--',.,,..,'...?,...\n",
970 ",,.,,,.\"\"\"\"..?-,.,.,,,\"\"\"\"-.,.,.,.;-.-.....-.,-.,, --',..:--,?--',,.,.,.,,:--.,,,,..:--?--,,...--,!,!\n",
971 ",,.,,,.;,.-\"\"\".\".-,,.-,.\"\"\"\".,.,,.\".:\"-,\",.\"'\",.\", :--?--,.--,.:--,.?--,,.__.--?.--,,.'..,,:--',,?--,\n"
976 "compare(sherlock['punctuation'], ulysses['punctuation'], offset=100)"
981 "execution_count": 88,
988 "output_type": "stream",
990 ",,,..\"\".\",,\"\"\".\",.,,.,.\"\",\"\",.,\"\"\",\".,.,'.,,,,,\",. .,\",\"!.\"\"?\"\"!,!__,.\"\".,,,,..\"\",.__,.-,.\"\",.\"\",,..\"\n",
991 "\"\";\",,..,,,-.,,,-,,,\".\"\",\",.\"\"\",,.\",..,\"\"\"\"\"\",\"\"\"\" \",.\"\"..,,,,.,__.\"\"-,..;;.\"\".;,-.__.\"\",\";\";.\"\".,__?\n",
992 "?'\"\"!...,,.--,,,\",--.\"\".\"\",.\"-,'\",\"...,\"\"\".\"\"\"..,. ..\"\",....\"\",.\"\",.\"\",,.\"\",,,.\".,,,,--.__.,,.,.;....\n",
993 ".\",.\"\",'.\".\"\"-\".\".\",\"\"\"\"\"\"\"\"\"\".\"\".\",;,\"\".''''''''' ,;..,:\".,.\"\"__.,\",\".\"\",,\",\",..\"\"...,,.\"\",\".;\".\".,,\n",
994 "''',''''\".\",-,.--,.',--',,,\",.\"\".\"..-''..,,.,,\"',. ,.\"',,'!..\"\",\";\".\"\",\".\",?\"\"-.\"\",,\",\".;,.\"\",,,.__.\"\n",
995 ".\",\".\"\",.\"..',,\"\",\"\",....\"\"-\"\".,..,,\",,..\"\".,.,,.- \",.,,;?\"\".'..__;,.;,,,,.\"..,\",!\"\"?\".\",,?__.,?,,.\",\n",
996 ",-.,,.-,,,,,.,,,,.\"\".\"\",.\"\".\",.,.\"\",.,,,.,\",.\",\".\" .\",\",\"..\"\".,\".\"__;?.;,.\";.;,,.\",.!..,!,,.\"\",,,\".;,\n",
997 "\".\"\",\";.\"\"\".\"\"\"\".\",\"\"\".\",.,,\"\"\",.,..\"\",\"\".,,.\"\";\". ,,.\",!\",.\";,,.,,;,.,,__,..\"\"!\",\";__,'.\".',..,,,,..\n",
998 "\"\",\".\",-,\"\"\",,\"..\"\",\",.\":,,-,.\"\",\".,.--.\"\".\"!.-!,! --,,;,-,...,,,,,.!;.'.\",\".,\",.\"..',.,;.,,.;.,..,,,\n",
999 "-!-!-!,,,,\"\".-\"\"\"\"\"\"\"\"\"\".,\"\"\"\",!\"\"-\"\"\"\"\"\"\"\"\"\"\"\"\"\", ,...;,.;..,,--.--.,,,..-;,,.,.--,.,;.,,,,,.,.,,;;,\n",
1000 "!!\"\"-\"\"\"\"..\"\"\"\"\"\".\"\"\"\",\"\"....\"\"\"\"\".\"\".\"\".\",.\"\"\"\"\"\" ;,,..;,,,..!..,,,..,,..,.,,;,..,,.\",,\",\"...\"\".,..,\n",
1001 "\"\"\"\"-,...\"\"\"\"...,.,.,-\"\"\"\"\"\"\"\".\"\",\".\",.,,\"\".\"\"\"\".\" .\"\",\".,\"!,;.\"\"__,\".,.\"!!,,..\"\"?\",,:\",__;.,.\"...;.,\n",
1002 "\",\"\"\"\"\"\"\"\"\".\"\".-.\"'\".\",,.'\".\"\".\"\"\"\"\",-,,.-,\",.\"-'\" ,;,,.....,.,.'.;,.,,,,...;.';.\"!.,\",\",..,.;.,!__,;\n",
1003 ".',.'.,,,.,,,,.,,,,,,,..,-,--,,.',.,-,.,.\",\",,,.\"\" !.,.!,,;,,;.,,.,,,,__--\"\"__,\",\"!',.!\"\"!,.!...'--\".\n",
1004 "\"'.,\"\"'.,,\"\";.,.'..,..,,,..-,,,.,-.,.\",,.,,,,,,\"\"\" ..,,,..\",\",\"__;,,.!,,!!,,-..\",,.,.\",\",\",-,;!--,!\"\"\n",
1005 ".\",'..-,.,,,.,.,.,,,,..,..-,.,,.\"...,?,,?,.,.,'.,. ,\",\",..\"\"..\"\"?..__,__.?..,,..\"\"!\"\"!,,....\"\";.\"\";__\n",
1006 ",,\"\"\".\",.,,,-.,,.\",-,,,..,.,,'','&',..'\",,-,,.'.,, .__,!--.--',--.',,?.\"\"--..,;.\",;;,,.;,,.,,,,,,.;'.\n",
1007 ".'.,',''\",.,.,.'.',''-,.\".',...,...,,,.''.''.!'''. .,,..,;,,,.;,,--.,,..,.,----,,.,.,,,,.',,.,.,.,,,,\n",
1008 "',,,,''\"-,,,,,,.,,.,.,,.,-\"\"\";\"\"\",.,.,,,,.''..,\"\"\" -,..,..;;,;;,,.,,,,.,..--,,.,..,,..,;,,,,,,,.,,;,.\n",
1009 "\"\",.\",.,,-\"\"\"\"'\"\"\"\"\"\"\"\",\"\"\"\"\"\"\"\"..\",\",...,,,.\"\"\"\". ,,,.'.,...,,,-,'.;.\"__,,\".-.\"__.'.\"\";.\"\"!,,.__--__\n",
1010 "..,.\"\"\"\"....-.\"\"\"\",\"\"\"\"--,,,.\"\"\"\"\",-.\"'-,-..,.,.\"\" ------..\"\".;?.',,__?:'!,;.'\"\"!,----,,,.\"\"____,,\".\"\n",
1011 ",,,,,\"\"\"\"\"\".,,\"-.,,,,...,,.,,.,.,,.',.,,.,-,-,-.\"\" .,?--!--__.\"\"'-,,..--.\"\",'?--?\".\"..\"\"--,;.\"\",\",\",.\n",
1012 ",,\".-..,.,\"\",\"\"..'...,,\"\",\"\"...,.?,.,..\"\"\"\"!\"\"\"\"\"\" __.\"\",.,..;,.,.\"\".,\",\".\"\",,\",\"__,.\"\",',__.\"\",\",\"__\n",
1013 "\"\"\"\"\"\"...\"-..,,,.,,,-,.,,,,.;,.,-,,.,,;,.\"\".\"\".\",, ,.,,,,.,__.\"\",\",\"__,__.\"\",\",,\",.,;,-,.,..,.\"\".,\",,\n",
1014 "'\".\"'\"\"'\".\"''.,,.,'\"\"'.,\"\".-..,\",.,,.,,..,-.,,..,. \".,.\"\",\".;\",.\";,...'.;,,__.,,,,;,,'.,____,;,,,..\",\n",
1015 ".\"\",-,,--\"\".,.,',..\",\".\".\"\"\"\"\"\"\"\",\"\"\"\"\",.\".,,.\"\"\"\" \",\";.,;.,.__--;.__.;,.\"\",.,,,.\"\",,'.\"\",,.\"\",.,,;,,\n",
1016 ",,.,,,.\"\"\"\"..?-,.,.,,,\"\"\"\"-.,.,.,.;-.-.....-.,-.,, .-.,.\"\",\",\",,,.';.,..;,..\"\".__,;--.\"\";-;,.\"\",\",\";-\n",
1017 ",,.,,,.;,.-\"\"\".\".-,,.-,.\"\"\"\".,.,,.\".:\"-,\",.\"'\",.\", ,..,.;.\"\",;.,.\".',..;;,.,..,;,.;,.,,..',.\".,\",\"?\"\"\n"
1022 "compare(sherlock['punctuation'], pap['punctuation'], offset=100)"
1026 "cell_type": "markdown",
1029 "### Compare more than two texts at a time"
1033 "cell_type": "code",
1034 "execution_count": 89,
1040 "def compare_many(*texts, offset=0, line_len=100, gap=' ', max_lines=30):\n",
1041 " def padded_segment(text, start, length):\n",
1042 " segment = text[start:start+length]\n",
1043 " segment += (' ' * (length - len(segment)))\n",
1044 " return segment\n",
1045 " segment_len = line_len // len(texts) - len(gap)\n",
1046 " max_len = min(max(len(text) for text in texts), segment_len * max_lines)\n",
1047 " for i in range(offset, max_len, segment_len):\n",
1048 " segments = [padded_segment(text, i, segment_len) for text in texts]\n",
1049 " print(gap.join(segments))"
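The column width in `compare_many` comes from splitting `line_len` evenly between the texts and then subtracting the width of `gap` from each share. A small sketch of that arithmetic, assuming the same defaults as the function above (the helper name `column_layout` is hypothetical and only here for illustration):

```python
def column_layout(n_texts, line_len=100, gap=' '):
    """Return (segment_len, row_width) for n_texts columns joined by gap."""
    # Same formula as compare_many: an equal share of the line, minus one gap.
    segment_len = line_len // n_texts - len(gap)
    # A printed row is n_texts segments with a gap between each adjacent pair.
    row_width = n_texts * segment_len + (n_texts - 1) * len(gap)
    return segment_len, row_width


print(column_layout(3))               # (32, 98)
print(column_layout(3, gap=' <> '))   # (29, 95)
```

Because one gap is subtracted per column but only `n - 1` gaps are printed, each row ends up slightly narrower than `line_len`.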
1053 "cell_type": "code",
1054 "execution_count": 90,
1061 "output_type": "stream",
1063 "..-.......'.........,,,.,,,.,.-' ,.,-..:::,[#]:,:,]::******,,.,,. ,.,-..::::,[#]:,:******,/:::::-:\n",
1064 "..,-,.,,...,-,,,,,,,,.,,,,.:,,., \".,\",\"?\"..\",\";\".,.\"..\"?\".\"__,.\". -:-:-:::::::-:-:\",,.,',----,','!\n",
1065 ",,.-,-(,.-,,,,.,,,,.,,.,,..-...; \",,,.;,,.;,.\"\"?\"\".\"\"?\"\"!,,!;.!\"\" ?--.\",,-,.,,..,,;.,.,,-,:\",(),,-\n",
1066 ",,.,,,,..\"\".\",,\"\"\".\",.,,.,.\"\",\"\" ??\"\".,\",\"!.\"\"?\"\"!,!__,.\"\".,,,,.. -.\"\"!!\",.,,,,.,,.,,,,,.\",,.',\",.\n",
1067 ",.,\"\"\",\".,.,'.,,,,,\",.\"\";\",,..,, \"\",.__,.-,.\"\",.\"\",,..\"\",.\"\"..,,, \"??\".\",?\"\"'?.,\".\".\"\"'..\"\",,\",,-,\n",
1068 ",-.,,,-,,,\".\"\",\",.\"\"\",,.\",..,\"\"\" ,.,__.\"\"-,..;;.\"\".;,-.__.\"\",\";\"; .\"'!,'?.\"\"?\",.\"?,.\",.,,.,,.,,,,,\n",
1069 "\"\"\",\"\"\"\"?'\"\"!...,,.--,,,\",--.\"\". .\"\".,__?..\"\",....\"\",.\"\",.\"\",,.\"\" ,,,.:\",'.',,,.!..!,.,!....,,?...\n",
1070 "\"\",.\"-,'\",\"...,\"\"\".\"\"\"..,..\",.\"\" ,,,.\".,,,,--.__.,,.,.;....,;..,: '..,,.?.-,.?!!,....',...!\",.\",\",\n",
1071 ",'.\".\"\"-\".\".\",\"\"\"\"\"\"\"\"\"\".\"\".\",;, \".,.\"\"__.,\",\".\"\",,\",\",..\"\"...,,. \"'..?\"\".,\",,\",,,.,..?.?\"\",\".\",\",\n",
1072 "\"\".'''''''''''',''''\".\",-,.--,.' \"\",\".;\".\".,,,.\"',,'!..\"\",\";\".\"\", ,\"?.\",..\",\",.,',.',..,,(),:\".?.\"\n",
1073 ",--',,,\",.\"\".\"..-''..,,.,,\"',..\" \".\",?\"\"-.\"\",,\",\".;,.\"\",,,.__.\"\", .\",\",--\".?',.',\".\".,'.\".\"',\".\".\"\n",
1074 ",\".\"\",.\"..',,\"\",\"\",....\"\"-\"\".,.. .,,;?\"\".'..__;,.;,,,,.\"..,\",!\"\"? \"';.?\"(),\"'....\",,..\"?\".\",.,..\",\n",
1075 ",,\",,..\"\".,.,,.-,-.,,.-,,,,,.,,, \".\",,?__.,?,,.\",.\",\",\"..\"\".,\".\"_ .\"?,\",.\"...'!\",..\"?\".\",',.,.\",,,\n",
1076 ",.\"\".\"\",.\"\".\",.,.\"\",.,,,.,\",.\",\" _;?.;,.\";.;,,.\",.!..,!,,.\"\",,,\". .\",\",,\"?,\",\",?\":\"'....?\"\"..-,'.'\n",
1077 ".\"\".\"\",\";.\"\"\".\"\"\"\".\",\"\"\".\",.,,\"\" ;,,,.\",!\",.\";,,.,,;,.,,__,..\"\"!\" ,..;,.--'.\"\",,\",'.\"-,.'.\",',,,.\"\n",
1078 "\",.,..\"\",\"\".,,.\"\";\".\"\",\".\",-,\"\"\" ,\";__,'.\".',..,,,,..--,,;,-,..., ,\",,\"',',,.''.\"'.:.',,';.,,*.,,.\n",
1079 ",,\"..\"\",\",.\":,,-,.\"\",\".,.--.\"\".\" ,,,,.!;.'.\",\".,\",.\"..',.,;.,,.;. ',,,..*.,\",\"\"?\",,;,'.,,;.,,,\",,.\n",
1080 "!.-!,!-!-!-!,,,,\"\".-\"\"\"\"\"\"\"\"\"\"., ,..,,,,...;,.;..,,--.--.,,,..-;, \",,.-.,,,,.,-----.,,,.,,,,.,,.,,\n",
1081 "\"\"\"\",!\"\"-\"\"\"\"\"\"\"\"\"\"\"\"\"\",!!\"\"-\"\"\" ,.,.--,.,;.,,,,,.,.,,;;,;,,..;,, ,,,.\",\",.\",,,\",.\",.\"-,-,,.\",,,\".\n",
1082 "\"..\"\"\"\"\"\".\"\"\"\",\"\"....\"\"\"\"\".\"\".\"\" ,..!..,,,..,,..,.,,;,..,,.\",,\",\" \",\",,\"?.?\",,,.\"!\".,-,,-,,.,-'.,,\n",
1083 ".\",.\"\"\"\"\"\"\"\"\"\"-,...\"\"\"\"...,.,.,- ...\"\".,..,.\"\",\".,\"!,;.\"\"__,\".,.\" ..-,,,.,,,.\",,,\",.,.,.','.:\"?.\"\"\n",
1084 "\"\"\"\"\"\"\"\".\"\",\".\",.,,\"\".\"\"\"\".\"\",\"\" !!,,..\"\"?\",,:\",__;.,.\"...;.,,;,, ,,.\"\"?\"..,.,,'.\",\".,,.,,,,,,-,,,\n",
1085 "\"\"\"\"\"\"\".\"\".-.\"'\".\",,.'\".\"\".\"\"\"\"\" .....,.,.'.;,.,,,,...;.';.\"!.,\", ..,.,'.,,,.-..,.'..,,,.,,.,,,',,\n",
1086 ",-,,.-,\",.\"-'\".',.'.,,,.,,,,.,,, \",..,.;.,!__,;!.,.!,,;,,;.,,.,,, ,..-,..',,,.'.','.\",!,,\",:\",.\".,\n",
1087 ",,,,..,-,--,,.',.,-,.,.\",\",,,.\"\" ,__--\"\"__,\",\"!',.!\"\"!,.!...'--\". .\",\".\",\".\",\";,-..\",,,\",..--.,,,,\n",
1088 "\"'.,\"\"'.,,\"\";.,.'..,..,,,..-,,,. ..,,,..\",\",\"__;,,.!,,!!,,-..\",,. ,,,,----.,.,,.\"!\";.\",,\",...,,,,.\n",
1089 ",-.,.\",,.,,,,,,\"\"\".\",'..-,.,,,., ,.\",\",\",-,;!--,!\"\",\",\",..\"\"..\"\"? ,,',..\",'....,?\",.\".\",,.\",\",,.,,\n",
1090 ".,.,,,,..,..-,.,,.\"...,?,,?,.,., ..__,__.?..,,..\"\"!\"\"!,,....\"\";.\" .,.',,-,,,,-,.,,,,.\"'?\",,.\",,\",.\n",
1091 "'.,.,,\"\"\".\",.,,,-.,,.\",-,,,..,., \";__.__,!--.--',--.',,?.\"\"--..,; \",\".-.-,,,,..,,';,',,'.,.,;.\"!\".\n",
1092 ",'','&',..'\",,-,,.'.,,.'.,',''\", .\",;;,,.;,,.,,,,,,.;'..,,..,;,,, \"!\",.,,,,.,,'-,.,.\"...,\".\"----,!\n"
1097 "compare_many(sherlock['punctuation'], pap['punctuation'], wap['punctuation'])"
1101 "cell_type": "code",
1102 "execution_count": 91,
1109 "output_type": "stream",
1111 ",,,..\"\".\",,\"\"\".\",.,,.,.\"\",\"\", <> .,\",\"!.\"\"?\"\"!,!__,.\"\".,,,,..\" <> !!\",.,,,,.,,.,,,,,.\",,.',\",.\"\n",
1112 ".,\"\"\",\".,.,'.,,,,,\",.\"\";\",,.. <> \",.__,.-,.\"\",.\"\",,..\"\",.\"\".., <> ??\".\",?\"\"'?.,\".\".\"\"'..\"\",,\",,\n",
1113 ",,,-.,,,-,,,\".\"\",\",.\"\"\",,.\",. <> ,,,.,__.\"\"-,..;;.\"\".;,-.__.\"\" <> -,.\"'!,'?.\"\"?\",.\"?,.\",.,,.,,.\n",
1114 ".,\"\"\"\"\"\",\"\"\"\"?'\"\"!...,,.--,,, <> ,\";\";.\"\".,__?..\"\",....\"\",.\"\", <> ,,,,,,,,.:\",'.',,,.!..!,.,!..\n",
1115 "\",--.\"\".\"\",.\"-,'\",\"...,\"\"\".\"\" <> .\"\",,.\"\",,,.\".,,,,--.__.,,.,. <> ..,,?...'..,,.?.-,.?!!,....',\n",
1116 "\"..,..\",.\"\",'.\".\"\"-\".\".\",\"\"\"\" <> ;....,;..,:\".,.\"\"__.,\",\".\"\",, <> ...!\",.\",\",\"'..?\"\".,\",,\",,,.,\n",
1117 "\"\"\"\"\"\".\"\".\",;,\"\".'''''''''''' <> \",\",..\"\"...,,.\"\",\".;\".\".,,,.\" <> ..?.?\"\",\".\",\",,\"?.\",..\",\",.,'\n",
1118 ",''''\".\",-,.--,.',--',,,\",.\"\" <> ',,'!..\"\",\";\".\"\",\".\",?\"\"-.\"\", <> ,.',..,,(),:\".?.\".\",\",--\".?',\n",
1119 ".\"..-''..,,.,,\"',..\",\".\"\",.\". <> ,\",\".;,.\"\",,,.__.\"\",.,,;?\"\".' <> .',\".\".,'.\".\"',\".\".\"\"';.?\"(),\n",
1120 ".',,\"\",\"\",....\"\"-\"\".,..,,\",,. <> ..__;,.;,,,,.\"..,\",!\"\"?\".\",,? <> \"'....\",,..\"?\".\",.,..\",.\"?,\",\n",
1121 ".\"\".,.,,.-,-.,,.-,,,,,.,,,,.\" <> __.,?,,.\",.\",\",\"..\"\".,\".\"__;? <> .\"...'!\",..\"?\".\",',.,.\",,,.\",\n",
1122 "\".\"\",.\"\".\",.,.\"\",.,,,.,\",.\",\" <> .;,.\";.;,,.\",.!..,!,,.\"\",,,\". <> \",,\"?,\",\",?\":\"'....?\"\"..-,'.'\n",
1123 ".\"\".\"\",\";.\"\"\".\"\"\"\".\",\"\"\".\",., <> ;,,,.\",!\",.\";,,.,,;,.,,__,..\" <> ,..;,.--'.\"\",,\",'.\"-,.'.\",',,\n",
1124 ",\"\"\",.,..\"\",\"\".,,.\"\";\".\"\",\".\" <> \"!\",\";__,'.\".',..,,,,..--,,;, <> ,.\",\",,\"',',,.''.\"'.:.',,';.,\n",
1125 ",-,\"\"\",,\"..\"\",\",.\":,,-,.\"\",\". <> -,...,,,,,.!;.'.\",\".,\",.\"..', <> ,*.,,.',,,..*.,\",\"\"?\",,;,'.,,\n",
1126 ",.--.\"\".\"!.-!,!-!-!-!,,,,\"\".- <> .,;.,,.;.,..,,,,...;,.;..,,-- <> ;.,,,\",,.\",,.-.,,,,.,-----.,,\n",
1127 "\"\"\"\"\"\"\"\"\"\".,\"\"\"\",!\"\"-\"\"\"\"\"\"\"\" <> .--.,,,..-;,,.,.--,.,;.,,,,,. <> ,.,,,,.,,.,,,,,.\",\",.\",,,\",.\"\n",
1128 "\"\"\"\"\"\",!!\"\"-\"\"\"\"..\"\"\"\"\"\".\"\"\"\" <> ,.,,;;,;,,..;,,,..!..,,,..,,. <> ,.\"-,-,,.\",,,\".\",\",,\"?.?\",,,.\n",
1129 ",\"\"....\"\"\"\"\".\"\".\"\".\",.\"\"\"\"\"\"\" <> .,.,,;,..,,.\",,\",\"...\"\".,..,. <> \"!\".,-,,-,,.,-'.,,..-,,,.,,,.\n",
1130 "\"\"\"-,...\"\"\"\"...,.,.,-\"\"\"\"\"\"\"\" <> \"\",\".,\"!,;.\"\"__,\".,.\"!!,,..\"\" <> \",,,\",.,.,.','.:\"?.\"\",,.\"\"?\".\n",
1131 ".\"\",\".\",.,,\"\".\"\"\"\".\"\",\"\"\"\"\"\"\" <> ?\",,:\",__;.,.\"...;.,,;,,..... <> .,.,,'.\",\".,,.,,,,,,-,,,..,.,\n",
1132 "\"\".\"\".-.\"'\".\",,.'\".\"\".\"\"\"\"\",- <> ,.,.'.;,.,,,,...;.';.\"!.,\",\", <> '.,,,.-..,.'..,,,.,,.,,,',,,.\n",
1133 ",,.-,\",.\"-'\".',.'.,,,.,,,,.,, <> ..,.;.,!__,;!.,.!,,;,,;.,,.,, <> .-,..',,,.'.','.\",!,,\",:\",.\".\n",
1134 ",,,,,..,-,--,,.',.,-,.,.\",\",, <> ,,__--\"\"__,\",\"!',.!\"\"!,.!...' <> ,.\",\".\",\".\",\";,-..\",,,\",..--.\n",
1135 ",.\"\"\"'.,\"\"'.,,\"\";.,.'..,..,,, <> --\"...,,,..\",\",\"__;,,.!,,!!,, <> ,,,,,,,,----.,.,,.\"!\";.\",,\",.\n",
1136 "..-,,,.,-.,.\",,.,,,,,,\"\"\".\",' <> -..\",,.,.\",\",\",-,;!--,!\"\",\",\" <> ..,,,,.,,',..\",'....,?\",.\".\",\n",
1137 "..-,.,,,.,.,.,,,,..,..-,.,,.\" <> ,..\"\"..\"\"?..__,__.?..,,..\"\"!\" <> ,.\",\",,.,,.,.',,-,,,,-,.,,,,.\n"
1142 "compare_many(sherlock['punctuation'], pap['punctuation'], wap['punctuation'], gap=' <> ', offset=100)"
1146 "cell_type": "markdown",
1151 "## Making images\n",
1152 "The text versions are fine, but let's turn the punctuation into images, with a coloured square for each punctuation character."
1156 "cell_type": "markdown",
1159 "Start by just trying to get something out."
1163 "cell_type": "code",
1164 "execution_count": 102,
1170 "# Periods, question marks and exclamation marks are red.\n",
1171 "# Commas, quotation marks and apostrophes are green.\n",
1172 "# Semicolons and colons are blue.\n",
1173 "colours = {'.': (255, 0, 0), '?': (255, 0, 0), '!': (255, 0, 0),\n",
1174 "           ',': (0, 255, 0), '\"': (0, 255, 0), \"'\": (0, 255, 0),\n",
1175 "           ':': (0, 0, 255), ';': (0, 0, 255),\n",
1176 "           'unknown': (128, 128, 128)}\n",
1180 "text = sherlock['punctuation']\n",
1181 "img = Image.new('RGBA', (max_x, max_y))\n",
1182 "draw = ImageDraw.Draw(img)\n",
1183 "x = 0\n",
1184 "y = 0\n",
1186 "# for i in range(100):\n",
1187 "#     if text[i] in colours:\n",
1188 "#         this_colour = colours[text[i]]\n",
1189 "for p in text:\n",
1190 "    if p in colours:\n",
1191 "        this_colour = colours[p]\n",
1192 "    else:\n",
1193 "        this_colour = colours['unknown']\n",
1194 "    draw.rectangle((x, y, x+block_size, y+block_size), fill=this_colour)\n",
1195 "    x += block_size\n",
1196 "    if x >= max_x:\n",
1197 "        x = 0\n",
1198 "        y += block_size\n",
1199 "img.save('test.png')"
1203 "cell_type": "markdown",
1207 "![alt text](test.png)"
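The first experiment above relies on `block_size`, `max_x` and `max_y`, whose values are set outside the lines shown. As a minimal self-contained sketch of the same idea, the following draws a short sample string. The canvas size, the block size and the `sample` string are assumptions chosen for illustration, not values taken from the notebook.

```python
from PIL import Image, ImageDraw

# Same colour scheme as the first experiment.
colours = {'.': (255, 0, 0), '?': (255, 0, 0), '!': (255, 0, 0),
           ',': (0, 255, 0), '"': (0, 255, 0), "'": (0, 255, 0),
           ':': (0, 0, 255), ';': (0, 0, 255),
           'unknown': (128, 128, 128)}

# Assumed values, chosen only for this sketch.
block_size = 4
max_x, max_y = 40, 8           # room for 10 blocks per row, 2 rows
sample = '.,;-"?!\'.,:,.'      # stand-in for sherlock['punctuation']

sketch_img = Image.new('RGBA', (max_x, max_y))
draw = ImageDraw.Draw(sketch_img)

x = 0
y = 0
for p in sample:
    # Characters without an entry (here the hyphen) fall back to grey.
    this_colour = colours.get(p, colours['unknown'])
    draw.rectangle((x, y, x + block_size, y + block_size), fill=this_colour)
    x += block_size
    if x >= max_x:
        # Row is full: return to the left edge and move down one row.
        x = 0
        y += block_size
```

Calling `sketch_img.save('sketch.png')` would write the result to disk in the same way as `test.png`; the file name here is hypothetical.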
1211 "cell_type": "markdown",
1214 "Rearrange the colours to match the \"heatmaps\" in [the original](https://medium.com/@neuroecology/punctuation-in-novels-8f316d542ec4#.qwj8e1n8m), and wrap the whole thing in a function."
1218 "cell_type": "code",
1219 "execution_count": 93,
1225 "# Periods, question marks and exclamation marks are red.\n",
1226 "# Commas, quotation marks and apostrophes are blue.\n",
1227 "# Semicolons and colons are green.\n",
1228 "def make_image(text, block_size=4, width=1000, colours=None):\n",
1229 "    default_colours = {'.': (255, 0, 0), '?': (255, 0, 0), '!': (255, 0, 0),\n",
1230 "                       ',': (0, 0, 255), '\"': (0, 0, 255), \"'\": (0, 0, 255),\n",
1231 "                       ':': (0, 255, 0), ';': (0, 255, 0),\n",
1232 "                       'unknown': (128, 128, 128)}\n",
1233 "    if not colours:\n",
1234 "        colours = {}\n",
1235 "    use_colours = default_colours.copy()\n",
1236 "    use_colours.update(colours)\n",
1237 "    height = ceil((len(text) * block_size) / width) * block_size  # rows x pixels per row\n",
1238 "    img = Image.new('RGBA', (width, height))\n",
1239 "    draw = ImageDraw.Draw(img)\n",
1240 "    x = 0\n",
1241 "    y = 0\n",
1242 "    for p in text:\n",
1243 "        if p in use_colours:\n",
1244 "            this_colour = use_colours[p]\n",
1245 "        else:\n",
1246 "            this_colour = use_colours['unknown']\n",
1247 "        draw.rectangle((x, y, x+block_size, y+block_size), fill=this_colour)\n",
1248 "        x += block_size\n",
1249 "        if x >= width:\n",
1250 "            x = 0\n",
1251 "            y += block_size\n",
1252 "    return img"
1256 "cell_type": "code",
1257 "execution_count": 94,
1263 "i = make_image(sherlock['punctuation'])\n",
1264 "i.save('sherlock.png')"
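The pixel height an image needs follows from simple arithmetic: the drawing loop starts a new row once `x` reaches `width`, so each row holds a fixed number of blocks, and each row is `block_size` pixels tall. A small sketch of that calculation, using a hypothetical helper `image_height` that is not part of the notebook:

```python
from math import ceil

def image_height(n_chars, block_size=4, width=1000):
    """Pixel height needed to draw n_chars blocks (hypothetical helper)."""
    # The loop wraps once x >= width, so a row holds this many blocks.
    blocks_per_row = ceil(width / block_size)
    # Every started row needs its full height, even if only partly filled.
    rows = ceil(n_chars / blocks_per_row)
    return rows * block_size

print(image_height(1000))                # 250 blocks per row, 4 rows
print(image_height(1000, block_size=6))  # 167 blocks per row, 6 rows
```

With the default settings a row holds 250 blocks, so even a text with tens of thousands of punctuation marks stays a manageable size.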
1268 "cell_type": "code",
1269 "execution_count": 95,
1275 "i = make_image(wap['punctuation'], block_size=6, colours={'-': (255,255,255)})\n",
1280 "cell_type": "code",
1281 "execution_count": 96,
1287 "i = make_image(wap['punctuation'], colours={'-': (255,255,255), '(': (255, 165, 0), ')': (255, 165, 0)})\n",
1292 "cell_type": "code",
1293 "execution_count": 97,
1299 "i = make_image(shakespeare['punctuation'])\n",
1300 "i.save('shakespeare.png')"
1304 "cell_type": "code",
1305 "execution_count": 98,
1311 "i = make_image(ulysses['punctuation'], colours={'-': (255,255,255), '(': (255, 165, 0), ')': (255, 165, 0)})\n",
1312 "i.save('ulysses.png')"
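The `colours` argument in the calls above lists only the characters to add or change; everything else keeps its default colour. A minimal sketch of that merge, using a shortened copy of the defaults and a hypothetical helper name `merge_colours`:

```python
# Shortened copy of the defaults, for illustration only.
default_colours = {'.': (255, 0, 0), ',': (0, 0, 255), 'unknown': (128, 128, 128)}

def merge_colours(overrides=None):
    """Return the defaults with any overrides applied (hypothetical helper)."""
    use_colours = default_colours.copy()   # leave the defaults untouched
    use_colours.update(overrides or {})    # add new keys, replace existing ones
    return use_colours

# Hyphens get their own colour instead of falling back to 'unknown'.
with_hyphens = merge_colours({'-': (255, 255, 255)})
```

Copying before updating means one call's overrides cannot leak into the next call, which matters when the same function is used for several books with different settings.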
1316 "cell_type": "code",
1317 "execution_count": 99,
1323 "i = make_image(pap['punctuation'])\n",
1328 "cell_type": "markdown",
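1334 "![Punctuation in the complete works of Sherlock Holmes](sherlock.png)\n",
1337 "![Punctuation in War and Peace](wap.png)\n",
1340 "![Punctuation in the complete works of Shakespeare](shakespeare.png)\n",
1343 "![Punctuation in Ulysses](ulysses.png)\n",
1345 "Pride and Prejudice:\n",
1346 "![Punctuation in Pride and Prejudice](pap.png)"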
1350 "cell_type": "code",
1351 "execution_count": null,
1361 "display_name": "Python 3",
1362 "language": "python",
1366 "codemirror_mode": {
1370 "file_extension": ".py",
1371 "mimetype": "text/x-python",
1373 "nbconvert_exporter": "python",
1374 "pygments_lexer": "ipython3",