8.5. Determining the Key Word

We will continue to use the same ciphertext from the previous section, stored in the string ciphertext:

ZQQTK PQUWD PGMWD BQTXY LFQWL SHAJB UCIPV KUQEJ RBAAC LRSIZ ZCRWT LDFMT PGYXF ISOSE ASZXN PHTAY HHIIR ADDIJ LBFOE VKUWW VFFLV TCEXG HFFXF ZVGXF BFQEI ZOSEZ UGFGF UJUGK PCZWZ UQQJI VAFLV CSDCX YOPYR SQTEI HQFII VTAYI LRGGR AWARN LAGWK JCZXZ UIMPC FTAVX LHMRU LAMRT PDMXV VIDWV SJQWW YCYOE VKXIU NSBVV CWAYJ SMMGH BWDIU DSYYJ AGQXR ZWPIF SRZSK PCZWR URQQS YOOIW YSELF USEEE KOEAV SSMVE DSYYJ APQHR PZKYE SSMVE PBSWF TSFLZ UUILZ JVUXY HGOSJ AIERF ZAMPC SONSL YOZHR ULUIK FHAET XIUVV HBPXY PGPMW MWOYC AMMXK HQTIJ PHEIC MAAVV JZAWV SMFSR UOSIZ UKTMT ODDSX YSEWY HGSEZ USPEJ AFARX HGOIE KSZGP VJQVG YSVYU PQQEE KWZAY PQTTV YGARJ HBPXY PBSWR YSPEP IMPEP MWZHZ UUFLV PFDIR SZQZV SWZPZ LIAJK OSUVT VBHIE AWARR SJMPL LHTIJ HAQTI PBOMG SSEAY PQTLR CSEAV WHMAR FHDEU PHUSE HZMFL ZSEEE KKTMT OODID HYURX YOBMU OOHST HAARX AVQVV CSZYV ZCRWZ USOYI PGFWR UREXI PDBME NHTIK OWZXR DRDCM LWXJI VAMXK YOOXZ CSEYG LFEXZ AWARJ HFQAF YYURX HGMGK PJQPP PBXMK LFMXL YSMWZ UGAGZ LHKXY LQDIU BZUXP VTARV DFUXV YCDXY LDMVK POXMK FCREE VHTII MWZHJ HGBSN LFRYC HHAYT OGFSE LOZHR ZKTSC LGAQV HQTEJ AWEID LBFME AVQLV HZFLP ZQQTK PQUWD VTMXV TDQVR ASOPR ZGAJR UHMKF UWEXJ HGFLV KFQED ZCRGF UGQVM HHUWD VFFLV PABSJ AIDIJ VTBPL YOXMJ AGURV JIDIJ PBFLV JVGVT OVUWK VFKEE KHDEU PHUSE DVQXY LFAJR UQUIE ACDGF TDMVR AWHIC FFQGV UHFMD LGMVV ZINNV JHQHK VJQVP KWRJV YSZXY HBPPZ UURVF THTEK DVUGY AVQME KIXKV UQQSI JFQHL SWFCF MTAVD LFMKV ZQAYC KOXPF DAQVV ZHMXV TSZXJ HFQNV HZAYJ SMIEK JVQHR URFLV TCFMM LGAJK OSIVZ ASDJF YAMWZ TDAVK HBFEE PBSVV KWQRK PBFLV HBMPP ZWESW OWELZ ZHAVP HGFLV MOOXJ OSDIT VFPWG YCNES PZUXP PGMTF DSDJL SOZHK YCGFC LGAQV ASEXR URUXZ ZPKXY PGFVF BPXIJ VAQWK HBPEI KHTEK HZMVX LDAVK PCZSW OWEXF YWOEC LJUHV UQQMJ ZWRXV KQARJ PGFIE JMUWE VZQWJ WSDXZ UOOMF BGMRU LLMGK PBSME PHEHV TOZHJ PBNVZ LTFSN YWFIR OWEXF YMIID BGFOE VKYSI LHTEE TSDIW HQFWY BAMRE HHGVV CWQAV KIZHV YOZME KIOXZ VBAJV EHQRU LRQBG LFUIE JSUWK OSNIJ AVQPG ACFLV JFUXZ JWEQF MVGQR UVUWK VFKLZ ZHAVZ JOXGY HFMGK LFEGR UCZPP ISQWK PAMXV KPKXY LGFEE 
KODHN OWOLY BAMRV EDQVZ LBOIN OSFLV YOOXL HZAVK YOPMK PCZEI FVMWW BFZMJ OSPXF MCDQT VFDIT AJUIN ZCRME KWHMU BOXWN LAGWK YSSEI KHTID HGRSI TWZKG HFFWF MOSVV HHILF SSIID BGFQV HGGVV AVQQS FHTIZ YFQPR AWARK VHTID HGESW ISURX ZPKAY VAFLV FODIJ BFDSL URQHR URURT VBFID WZMXZ UUFLV PBOMU LBFWZ UHTIZ YZUZV ZCDGF URUXZ VBILZ JVFVR KWFMF UVMWY HBPIU KCIRK VIEAV TIEXI HHTII JCZWZ KSDXY LUQRV YOXFV HFURX VTFLV DVAPV UODVR AWHIK OOZXY LFQWG LQFMM LDDSS HPUPZ AMAJZ AGPIK HWXW

Using Chi-Squared to Determine the Keyword

Once you’ve determined the most likely length of the keyword, which we believe to be \(5\), you can split the ciphertext into the same groups that were created when calculating the average index of coincidence in the previous section. Every letter in a group was enciphered with the same keyword letter, so each group behaves like a simple shift cipher. For each group, you can then determine which key value (0-25) results in the minimum chi-squared score. That key value corresponds to a letter of the keyword (0 = A, 1 = B, ..., 25 = Z).
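As a reminder of what is being minimized, the score for a candidate key value \(k\) can be written as follows. This assumes the usual count-based form of the statistic; the exact letter-frequency table behind the scores in this section is not shown here, so the symbols below are illustrative.

\[
\chi^2(k) = \sum_{i=0}^{25} \frac{\left(O_i(k) - E_i\right)^2}{E_i}, \qquad E_i = n \, p_i
\]

Here \(O_i(k)\) is the number of times letter \(i\) appears in the group after every letter has been shifted back by \(k\), \(n\) is the number of letters in the group, and \(p_i\) is the relative frequency of letter \(i\) in typical English text. The key value whose shifted group looks most like English gives the smallest score.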

Group 1

Let’s first look at the details for a single group.

Creating the first group as in the previous section:

ZPPBLSUKRLZLPIAPHALVVTHZBZUUPUVCYSHVLALJUFLLPVSYVNCSBDAZSPUYYUKSDAPSPTUJHAZSYUFXHPMAHPMJSUUOYHUAHKVYPKPYHPYIMUPSSLOVASLHPSPCWFPHZKOHYOHACZUPUPNODLVYCLAHYHPPLYULLBVDYLPFVMHLHOLZLHALAHZPVTAZUUHKZUHVPAVYAJPJOVKPDLUATAFULZJVKYHUTDAKUJSMLZKDZTHHSJUTLOAYTHPKPHZOZHMOVYPPDSYLAUZPBVHKHLPOYLUZKPJVWUBLPPTPLYOYBVLTHBHCKYKVELLJOAAJJMUVZJHLUIPKLKOBELOYHYPFBOMVAZKBLYKHTHMHSBHAFYAVHIZVFBUUVWUPLUYZUVJKUHKVTHJKLYHVDUAOLLLHAAH
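The grouping step can be sketched in a few lines of Python. This is a minimal illustration, not necessarily the function used in the previous section; the name `split_into_groups` is hypothetical. Group \(i\) simply collects every fifth letter starting at offset \(i\).

```python
def split_into_groups(ciphertext, key_length):
    """Split a ciphertext into key_length groups.

    Group i holds the letters at positions i, i + key_length,
    i + 2 * key_length, ... after spaces and other non-letters
    have been removed.
    """
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    return ["".join(letters[i::key_length]) for i in range(key_length)]


# The first four blocks of the ciphertext from this section.
sample = "ZQQTK PQUWD PGMWD BQTXY"
groups = split_into_groups(sample, 5)
print(groups[0])  # ZPPB
```

The four letters printed match the start of Group 1 shown above.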

We can then examine all the possible key values (and their corresponding letters) for the first position in the keyword, along with their associated chi-squared scores:

 0 (A):  2746.2793
 1 (B):  5532.9047
 2 (C):  5046.4354
 3 (D):  2307.0057
 4 (E):  4271.2940
 5 (F):  3694.3021
 6 (G):  3376.3487
 7 (H):    36.9091
 8 (I):  8548.5127
 9 (J):  2520.3679
10 (K):  6779.9744
11 (L):  4475.3529
12 (M):  7798.8582
13 (N):  1727.0775
14 (O):  3390.7807
15 (P):  3106.8933
16 (Q):  6755.2973
17 (R):  6788.1546
18 (S):  3208.2398
19 (T):  2076.8860
20 (U):  2423.7105
21 (V):  9278.0497
22 (W):  3638.0067
23 (X):  2934.9926
24 (Y):  5237.7531
25 (Z):  7475.5848
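A table like the one above could be produced with a sketch along the following lines. The function names are hypothetical, and the English letter frequencies are a commonly quoted table that is assumed here; if the scores in this section were computed with a different table, the numbers will differ slightly, though the best key value should not.

```python
from collections import Counter

# Assumed relative frequencies of A-Z in English text (commonly quoted values).
ENGLISH_FREQ = [
    0.08167, 0.01492, 0.02782, 0.04253, 0.12702, 0.02228, 0.02015,  # A-G
    0.06094, 0.06966, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749,  # H-N
    0.07507, 0.01929, 0.00095, 0.05987, 0.06327, 0.09056, 0.02758,  # O-U
    0.00978, 0.02360, 0.00150, 0.01974, 0.00074,                    # V-Z
]


def chi_squared_for_shift(group, shift):
    """Chi-squared score of an uppercase A-Z group after shifting back by shift."""
    n = len(group)
    counts = Counter((ord(ch) - ord("A") - shift) % 26 for ch in group)
    return sum(
        (counts.get(i, 0) - n * p) ** 2 / (n * p)
        for i, p in enumerate(ENGLISH_FREQ)
    )


def score_all_shifts(group):
    """Return a list of 26 scores, one per key value 0-25."""
    return [chi_squared_for_shift(group, k) for k in range(26)]


# Demo: shift some English text forward by 7 (key letter H), then score it.
demo_plain = (
    "scepticismisasmuchtheresultofknowledgeasknowledgeisofscepticism"
    "tobecontentwithwhatweatpresentknowisforthemostparttoshutourears"
)
demo_group = "".join(chr((ord(c) - 97 + 7) % 26 + 65) for c in demo_plain)
for k, score in enumerate(score_all_shifts(demo_group)):
    print(f"{k:2d} ({chr(65 + k)}): {score:10.4f}")
```

Passing the full Group 1 string to `score_all_shifts` instead of the demo group would give one score per key value in the same layout as the table.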

You can see that the key value of \(7\), which corresponds to the letter H, is the most likely candidate for the first character in the keyword, since it has the lowest chi-squared score by a wide margin (36.9091, against 1727.0775 for the next-best key value, 13). This implies that the ciphertext characters in Group 1 were most likely enciphered with the letter H, the first character of the 5-letter keyword.

Continuing the Process

We can use this same method to determine the remaining letters in the keyword:

Group: 1	 Min Score: 36.90910	 Key value: 7	 Key Letter: H
Group: 2	 Min Score: 41.31034	 Key value: 14	 Key Letter: O
Group: 3	 Min Score: 19.01105	 Key value: 12	 Key Letter: M
Group: 4	 Min Score: 36.14494	 Key value: 4	 Key Letter: E
Group: 5	 Min Score: 19.28371	 Key value: 17	 Key Letter: R
Keyword: HOMER
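Putting the pieces together, one possible sketch of the whole procedure is shown below. All function names are hypothetical and the frequency table is an assumed, commonly quoted one, so this is an illustration of the method rather than the code that produced the output above. It also assumes the keyword length is no larger than the number of letters in the ciphertext, so that no group is empty.

```python
from collections import Counter

# Assumed relative frequencies of A-Z in English text (commonly quoted values).
ENGLISH_FREQ = [
    0.08167, 0.01492, 0.02782, 0.04253, 0.12702, 0.02228, 0.02015,  # A-G
    0.06094, 0.06966, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749,  # H-N
    0.07507, 0.01929, 0.00095, 0.05987, 0.06327, 0.09056, 0.02758,  # O-U
    0.00978, 0.02360, 0.00150, 0.01974, 0.00074,                    # V-Z
]


def chi_squared_for_shift(group, shift):
    """Chi-squared score of an uppercase A-Z group after shifting back by shift."""
    n = len(group)
    counts = Counter((ord(ch) - ord("A") - shift) % 26 for ch in group)
    return sum(
        (counts.get(i, 0) - n * p) ** 2 / (n * p)
        for i, p in enumerate(ENGLISH_FREQ)
    )


def best_shift(group):
    """Return (key value, score) for the key value with the lowest score."""
    scores = [chi_squared_for_shift(group, k) for k in range(26)]
    k = min(range(26), key=scores.__getitem__)
    return k, scores[k]


def find_keyword(ciphertext, key_length):
    """Recover the most likely keyword of the given length."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    keyword = []
    for i in range(key_length):
        group = "".join(letters[i::key_length])  # every key_length-th letter
        k, _score = best_shift(group)
        keyword.append(chr(ord("A") + k))
    return "".join(keyword)
```

With the ciphertext string from this section, `find_keyword(ciphertext, 5)` would be expected to return the same key letters as the output above, provided the assumed frequency table is close to the one used there. Short groups give noisier scores, so the method is more reliable on long ciphertexts.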

Now that the keyword is known, you can attempt to decipher the message to determine if the analysis was correct.

print( vigenereDecipher( ciphertext, 'HOMER' ) )
scepticismisasmuchtheresultofknowledgeasknowledgeisofscepticismtobecontentwithwhatweatpresentknowisforthemostparttoshutourearsagainstconvictionsincefromtheverygradualcharacterofoureducationwemustcontinuallyforgetandemancipateourselvesfromknowledgepreviouslyacquiredwemustsetasideoldnotionsandembracefreshonesandaswelearnwemustbedailyunlearningsomethingwhichithascostusnosmalllabourandanxietytoacquireandthisdifficultyattachesitselfmorecloselytoanageinwhichprogresshasgainedastrongascendencyoverprejudiceandinwhichpersonsandthingsaredaybydayfindingtheirreallevelinlieuoftheirconventionalvaluethesameprincipleswhichhavesweptawaytraditionalabusesandwhicharemakingrapidhavocamongtherevenuesofsinecuristsandstrippingthethintawdryveilfromattractivesuperstitionsareworkingasactivelyinliteratureasinsocietythecredulityofonewriterorthepartialityofanotherfindsaspowerfulatouchstoneandaswholesomeachastisementinthehealthyscepticismofatemperateclassofantagonistsasthedreamsofconservatismortheimposturesofpluralistsinecuresinthechurchhistoryandtraditionwhetherofancientorcomparativelyrecenttimesaresubjectedtoverydifferenthandlingfromthatwhichtheindulgenceorcredulityofformeragescouldallowmerestatementsarejealouslywatchedandthemotivesofthewriterformasimportantaningredientintheanalysisofhishistoryasthefactsherecordsprobabilityisapowerfulandtroublesometestanditisbythistroublesomestandardthatalargeportionofhistoricalevidenceissiftedconsistencyisnolesspertinaciousandexactinginitsdemandsinbrieftowriteahistorywemustknowmorethanmerefactshumannatureviewedunderaninductionofextendedexperienceisthebesthelptothecriticismofhumanhistoryhistoricalcharacterscanonlybeestimatedbythestandardwhichhumanexperiencewhetheractualortraditionaryhasfurnishedtoformcorrectviewsofindividualswemustregardthemasformingpartsofagreatwholewemustmeasurethembytheirrelationtothemassofbeingsbywhomtheyaresurroundedandincontemplatingtheincidentsintheirlivesorconditionwhichtraditionhashandeddowntouswemustratherconsiderthegeneralbearingofthe
wholenarrativethantherespectiveprobabilityofitsdetails

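The output is readable English, which confirms that the keyword is correct. The passage appears to come from the introduction to an English edition of Homer’s The Iliad.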
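The deciphering step above relies on a vigenereDecipher function that is not shown in this section. As a rough stand-in, and assuming it skips non-letters and returns lowercase plaintext as the output above suggests, the same behaviour could be sketched like this (the name `vigenere_decipher_sketch` is hypothetical):

```python
def vigenere_decipher_sketch(ciphertext, keyword):
    """Decipher a Vigenere ciphertext by subtracting the key letters mod 26.

    Non-letters are dropped and do not advance the key position.
    """
    shifts = [ord(k) - ord("A") for k in keyword.upper()]
    plain = []
    position = 0  # index into the keyword, advanced only on letters
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            shift = shifts[position % len(shifts)]
            plain.append(chr((ord(ch) - ord("A") - shift) % 26 + ord("a")))
            position += 1
    return "".join(plain)


# The first two blocks of the ciphertext from this section.
print(vigenere_decipher_sketch("ZQQTK PQUWD", "HOMER"))  # scepticism
```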

Exercises for the Reader

Can you write Python functions to:

  • Use chi-squared scoring and a determined keyword length to determine the keyword letters?