the probability that two letters selected at random from a ciphertext $C$ are identical. Below we show how to mathematically demonstrate that the index of coincidence for a monoalphabetic cipher is about 0.065, and the index of coincidence for a polyalphabetic cipher is somewhere between 0.0385 and 0.065. For very long keywords, the index of coincidence for polyalphabetic ciphers will be closer to 0.0385. Hence, by a simple analysis of intercepted ciphertext, a cryptanalyst can relatively easily determine the type of cryptosystem being used. This was quite a breakthrough. Moreover, his idea contained a mechanism for determining the probable keylength, as had Kasiski's. Here is how it works.
First we need a table of letter frequencies for the English alphabet. This well-known, standard table (presented here as Table 2.5) augments Tables 1.4 and 1.5, which we presented on pages 44 and 45, when we discussed letter frequencies in Section 1.4.
Now suppose that $n$ stands for the number of letters in a ciphertext, $C$, and $n_j$ stands for the number of occurrences in $C$ of the letter in the $j$-th position of the English alphabet. In other words, $n_1$ is the number of occurrences of the letter a in $C$, $n_2$ is the number of occurrences of the letter b in $C$, and so on. Without getting into the reasons for it, the Index of Coincidence, IC, is given as approximately the following.
$$\mathrm{IC} \approx \frac{n_1^2}{n^2} + \frac{n_2^2}{n^2} + \cdots + \frac{n_{26}^2}{n^2}.$$

[Figure 2.14: Elizabeth S. Friedman.]
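The approximation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `index_of_coincidence` is hypothetical, and the sketch assumes that only the 26 letters a to z count toward $n$, with case ignored and all other characters skipped.

```python
from collections import Counter
import string


def index_of_coincidence(ciphertext: str) -> float:
    """Approximate IC as the sum over the letters of n_j^2 / n^2.

    Assumption: only the letters a-z are counted; case is ignored and
    spaces, digits, and punctuation are skipped.
    """
    letters = [ch for ch in ciphertext.lower() if ch in string.ascii_lowercase]
    n = len(letters)  # n: number of letters in the ciphertext
    if n == 0:
        raise ValueError("ciphertext contains no letters a-z")
    counts = Counter(letters)  # counts[letter] plays the role of n_j
    return sum(c * c for c in counts.values()) / (n * n)


if __name__ == "__main__":
    # Two distinct letters, two occurrences each: (2^2 + 2^2) / 4^2 = 0.5
    print(index_of_coincidence("AABB"))
```

A text in which every one of the 26 letters appears equally often gives 1/26 under this formula, which is roughly the 0.0385 lower bound quoted earlier.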
So if we want to compute IC for the English language from Table 2.5, then since each of the numbers in the table is a percentage, we divide each by 100 and get:

$$\mathrm{IC} \approx (0.08167)^2 + (0.01492)^2 + \cdots + (0.00074)^2 \approx 0.065,$$

which explains the aforementioned Index of Coincidence for monoalphabetic ciphers, since a monoalphabetic substitution only relabels the letters and so leaves the frequency distribution invariant. (Note that the symbol ≈ means “approximately equal to”. It is not a strict equality, but this suffices since we are dealing with a statistical analysis wherein approximations are good enough for our investigations.)
Table 2.5: Relative Letter Frequencies for English (percent)

a      b      c      d      e       f      g      h      i
8.167  1.492  2.782  4.253  12.702  2.228  2.015  6.094  6.966

j      k      l      m      n      o      p      q      r
0.153  0.772  4.025  2.406  6.749  7.507  1.929  0.095  5.987

s      t      u      v      w      x      y      z
6.327  9.056  2.758  0.978  2.360  0.150  1.974  0.074
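As a check on the arithmetic, the sum of squared frequencies from Table 2.5 can be evaluated directly. The sketch below is illustrative only: the names `ENGLISH_PERCENT` and `ic_from_percentages` are hypothetical, and the uniform case (every letter equally likely) is included as an assumed model of the 0.0385 lower bound.

```python
# Percentages copied from Table 2.5.
ENGLISH_PERCENT = {
    "a": 8.167, "b": 1.492, "c": 2.782, "d": 4.253, "e": 12.702,
    "f": 2.228, "g": 2.015, "h": 6.094, "i": 6.966, "j": 0.153,
    "k": 0.772, "l": 4.025, "m": 2.406, "n": 6.749, "o": 7.507,
    "p": 1.929, "q": 0.095, "r": 5.987, "s": 6.327, "t": 9.056,
    "u": 2.758, "v": 0.978, "w": 2.360, "x": 0.150, "y": 1.974,
    "z": 0.074,
}


def ic_from_percentages(percentages: dict) -> float:
    """Divide each percentage by 100, square it, and sum the results."""
    return sum((p / 100.0) ** 2 for p in percentages.values())


english_ic = ic_from_percentages(ENGLISH_PERCENT)

# Assumed baseline: 26 equally likely letters, so 26 * (1/26)^2 = 1/26.
uniform_ic = 26 * (1 / 26) ** 2

if __name__ == "__main__":
    print(f"English: {english_ic:.4f}")  # roughly 0.0655
    print(f"Uniform: {uniform_ic:.4f}")  # roughly 0.0385
```

The two values bracket the range quoted for polyalphabetic ciphers: a short keyword leaves the ciphertext statistics near the English figure, while a very long keyword pushes them toward the uniform one.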