Classical Ciphers and Their Cryptanalysis - Introduction to Cryptography with Maple


language. Suppose now that we have a language such that the relative frequencies of the alphabet characters are $p_0, p_1, \ldots, p_{r-1}$. Then, starting with Eq. 1.1 above, we have for a text of length $n$ with frequencies $f_0, f_1, \ldots, f_{r-1}$:

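$$I_c(x) = \sum_{i=0}^{r-1} \frac{f_i (f_i - 1)}{n (n - 1)} = \sum_{i=0}^{r-1} \frac{f_i}{n} \cdot \frac{f_i - 1}{n - 1}$$

Now, for $n$ large, each of the relative frequencies $f_i / n$ in the text is close to $p_i$ and we obtain the following approximation:

$$I_c(x) \approx \sum_{i=0}^{r-1} \left( \frac{f_i}{n} \right)^2 \approx \sum_{i=0}^{r-1} p_i^2$$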
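The exact count-based value and its large-$n$ approximation can be compared numerically. The following is a minimal sketch in Python rather than the book's Maple. The function names `index_of_coincidence` and `sum_of_squared_frequencies` and the sample string are illustrative assumptions, not part of the original text.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Exact IC: sum of f_i * (f_i - 1) over all symbols, divided by n * (n - 1)."""
    n = len(text)
    if n < 2:
        raise ValueError("need at least two characters")
    counts = Counter(text)  # f_i for every symbol that occurs in the text
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


def sum_of_squared_frequencies(text: str) -> float:
    """Large-n approximation: sum of (f_i / n) ** 2."""
    n = len(text)
    return sum((f / n) ** 2 for f in Counter(text).values())


# Hypothetical sample text (1750 characters), used only to compare the two values.
sample = "thequickbrownfoxjumpsoverthelazydog" * 50
exact = index_of_coincidence(sample)
approx = sum_of_squared_frequencies(sample)
print(f"exact = {exact:.6f}, approximation = {approx:.6f}")
```

The difference between the two values works out to $(1 - \sum_i (f_i/n)^2)/(n - 1)$, so it shrinks roughly like $1/n$ as the text grows.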

This value is the probability that two characters chosen at random in a text in

the given language are the same. We may apply a similar method to compute the

IC of the language consisting of random texts over a given alphabet. For such a

text (a text over the given alphabet, whose characters are randomly generated with

uniform probability distribution), the IC has the following value:

$$I_c \approx \frac{1}{r}$$

We see that, as was to be expected, in this case the IC depends only on the size of the alphabet. For example, it is approximately 0.0384 for the 26-letter English alphabet and 0.037 for the 27-letter Spanish alphabet.
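As a short worked step, the uniform value follows by setting every $p_i$ equal to $1/r$ in the sum of squared probabilities. The decimal expansions below are plain arithmetic and are not taken from the book:

$$I_c \approx \sum_{i=0}^{r-1} \left( \frac{1}{r} \right)^2 = r \cdot \frac{1}{r^2} = \frac{1}{r}, \qquad \frac{1}{26} = 0.03846\ldots, \qquad \frac{1}{27} = 0.03703\ldots$$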

Exercise 1.11 Use the functions Generate and AddFlavor from Maple's RandomTools package to generate pseudo-random text strings of specified length over a given alphabet. Compute the IC of these text strings using the function Ic (or IndexOfCoincidence) and check that they are close to the expected value for a random string over the considered alphabet.
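The experiment in this exercise can also be tried outside Maple. The sketch below uses Python's standard `random` module, not the RandomTools functions the exercise asks for. The helper names `random_text` and `index_of_coincidence`, the seed, and the text length are arbitrary assumptions made for illustration.

```python
import random
import string
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Exact IC computed from the character counts of the text."""
    n = len(text)
    counts = Counter(text)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


def random_text(length: int, alphabet: str, rng: random.Random) -> str:
    """Pseudo-random string whose characters are drawn uniformly from alphabet."""
    return "".join(rng.choice(alphabet) for _ in range(length))


rng = random.Random(2024)  # fixed seed, chosen arbitrarily, for reproducibility
alphabet = string.ascii_lowercase  # the 26-letter English alphabet
text = random_text(100_000, alphabet, rng)
ic = index_of_coincidence(text)
expected = 1 / len(alphabet)
print(f"IC = {ic:.5f}, expected about {expected:.5f}")
```

Passing a different alphabet string to `random_text` lets the same check be repeated for alphabets of other sizes.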

As we observed before, the IC increases when the frequency distribution of characters is more uneven (that is, when the language in question has more redundancy) and so the IC of a natural language with an alphabet of $r$ symbols is larger than $1/r$. We can use the above formula to compute the indices of coincidence of English and Spanish from their frequency distributions. The frequency distribution of the languages we are considering is given in Maple by the following procedure. The list of frequencies in English is taken from [130] and that for Spanish was compiled from a collection of texts of various kinds which included around 850,000 characters.

> freqlist := proc(language)
    if language = en then
      [0.0817, 0.0149, 0.0278, 0.0425, 0.1270, 0.0223, 0.0202, 0.0609, 0.0697,
       0.0015, 0.0077, 0.0403, 0.0241, 0.0675, 0.0751, 0.0193, 0.0010, 0.0599,
