0.0633, 0.0906, 0.0276, 0.0098, 0.0236, 0.0015, 0.0197, 0.0007]
elif language = es then
[0.1174, 0.0117, 0.0436, 0.0474, 0.1389, 0.0073, 0.0096, 0.0098, 0.0713,
0.0031, 0.00024, 0.0535, 0.0294, 0.0713, 0.001, 0.0922, 0.0265, 0.0115,
0.0624, 0.0781, 0.0453, 0.043, 0.0109, 0.00006, 0.0021, 0.0085, 0.0033]
end if
end proc:
Exercise 1.12 Use literary works in text format, taken from the Internet, to compute
with Maple the frequency distribution of different languages.
The following procedure computes the index of coincidence of a language from
the frequency list:
> Ind := proc(language)
add(i^2, i in freqlist(language))
end proc:
In particular, we see that the indices of coincidence of English and Spanish,
computed from the frequency distributions given above, are the following:
> Ind(en);
0.06552226
> Ind(es);
0.0749428912
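The procedure Ind works from a frequency table. As a complement, here is a minimal sketch of how the index of coincidence of a concrete text might be computed directly from its letter counts. The name ICtext is hypothetical and not part of the text's code; the sketch assumes a 26-letter alphabet a..z and ignores every other character, so it is not suited as written to alphabets with additional letters. It uses the usual sample estimate, the probability that two letters drawn without replacement coincide, which for long texts is practically the same as the sum of squared frequencies used by Ind.

```maple
# Hypothetical helper (not part of the text's code): index of coincidence of a string.
# Assumes a 26-letter alphabet a..z; every other character is ignored.
ICtext := proc(s::string)
    local t, counts, n, i, k;
    t := StringTools:-LowerCase(s);
    # Number of occurrences of each of the letters a..z (ASCII codes 97..122).
    counts := [seq(StringTools:-CountCharacterOccurrences(t, StringTools:-Char(i)),
                   i = 97 .. 122)];
    n := add(k, k in counts);    # number of letters actually counted
    # Probability that two letters drawn without replacement coincide.
    # Needs n >= 2, otherwise the denominator vanishes.
    evalf(add(k*(k - 1), k in counts) / (n*(n - 1)))
end proc:
```

For instance, ICtext("aabb") evaluates to one third, since 4 of the 12 ordered pairs of distinct positions hold equal letters.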
Of course, the indices returned by Ind are only approximations and one may obtain
slightly different values by using different texts as sources. For example, the IC of
'Don Quijote', in the version available at the 'Project Gutenberg' web page [161], with
1,640,590 characters, is just 0.0746771. Other sources give the index of coincidence
of Spanish as 0.078 and that of English as 0.067. We will see that the approximations
given in the Maple code above are precise enough to allow for an accurate
cryptanalysis of the Vigenère cipher.
The difference between the IC of a random text and the IC characteristic of a
language can be exploited to find the length of the key used in a Vigenère encryption.
Observe that, in a monoalphabetic cipher, the ICs of the plaintext and the
corresponding ciphertext are exactly the same, because the substitution merely
permutes the letter frequencies. Hence the IC of the ciphertext should be close to the
one corresponding to the language used in the plaintext. In contrast, the ciphertext
obtained by applying a polyalphabetic cipher such as Vigenère's should have an IC
which is closer to that of a random text over the alphabet (the longer the key, the
closer). This remark can be analyzed quantitatively in order to derive a formula that
gives the key length in terms of the IC of the ciphertext, the IC of the plaintext
language and the size of the corresponding alphabet (the last two are known if the
plaintext language is known, as postulated by Kerckhoffs' principle). However, such
formulas are not very precise, especially when the keyword has repeated characters
and, for this reason, we will sketch a different method (also based on the above
remarks) to determine the key length, which is much more precise in practice. To
illustrate it, suppose that we have the following ciphertext, corresponding to an
English plaintext:
> c := "oltmhvurseaebwzchltyqwncnhouyqquvwaibcnwgwvofmrplupaujpoxlogautayzkmxzevec\
wbbsvmifdfucgmmshtijtyngepyyqohasfontvnwtrivwzcbepyatbsfejbiwmnavbioikqwlmmsttlw\
nfkfpjlmvuwzakqnztxqqugsejjlhimsobmlifdmtnclyevwsfrnhimsirmwvfdmqiodvarvgirygwok\
nylstylupaujcnwoeinxtb":
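For comparison, the formula-based estimate mentioned above can be tried on this ciphertext. The text does not state the formula, so the sketch below assumes the commonly quoted one, usually attributed to Friedman: L ≈ n(kp - kr) / ((n - 1)ko - n kr + kp), where n is the ciphertext length, ko the IC of the ciphertext, kp the IC of the plaintext language and kr = 1/26 the IC of uniformly random text. The name KeyLengthEstimate is hypothetical, and the procedure assumes that the ciphertext contains only the lowercase letters a..z.

```maple
# Hypothetical helper (not from the text): key-length estimate from the ciphertext IC.
# Assumes s consists only of the lowercase letters a..z and has at least 2 letters.
#   s  : ciphertext
#   kp : IC of the plaintext language, e.g. the value returned by Ind(en)
KeyLengthEstimate := proc(s::string, kp::numeric)
    local n, counts, ko, kr, i, k;
    n := length(s);
    # Number of occurrences of each of the letters a..z (ASCII codes 97..122).
    counts := [seq(StringTools:-CountCharacterOccurrences(s, StringTools:-Char(i)),
                   i = 97 .. 122)];
    ko := add(k*(k - 1), k in counts) / (n*(n - 1));   # IC of the ciphertext
    kr := 1/26;                                        # IC of uniformly random text
    # Assumed formula: solves the expected-IC relation for the key length.
    # The result is unreliable when ko is close to kr (denominator near zero).
    evalf(n*(kp - kr) / ((n - 1)*ko - n*kr + kp))
end proc:
```

With the definitions given earlier, a call such as KeyLengthEstimate(c, Ind(en)) returns a real number that is at best a rough indication of the key length, in line with the caveat about precision made above.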