Cryptography Reference
Table 2.1: Approximate letter frequencies for the English language [29]

Letter  Frequency (%)    Letter  Frequency (%)    Letter  Frequency (%)
A        8.167           B        1.492           C        2.782
D        4.253           E       12.702           F        2.228
G        2.015           H        6.094           I        6.966
J        0.153           K        0.772           L        4.025
M        2.406           N        6.749           O        7.507
P        1.929           Q        0.095           R        5.987
S        6.327           T        9.056           U        2.758
V        0.978           W        2.360           X        0.150
Y        1.974           Z        0.074

key the same plaintext letter is always encrypted to the same ciphertext letter.
Cryptosystems where the encryption algorithm has this property are usually
referred to as monoalphabetic ciphers. Given a ciphertext, suppose that we:
• Know that it has been encrypted using a monoalphabetic cipher, in our case
the Simple Substitution Cipher. This is reasonable, since we normally assume
knowledge of the encryption algorithm used (see Section 1.5.1).
• Know the language in which the plaintext is expressed. This is reasonable, since
in most contexts we either know this or could at least guess.
The following strategy then presents itself:
1. Count the frequency of occurrence of letters in the ciphertext, which can be
represented as a histogram.
2. Compare the ciphertext letter frequencies with the letter frequencies of the
underlying plaintext language.
3. Make an informed guess that the most commonly occurring ciphertext letter
represents the most commonly occurring plaintext letter. Repeat with the
second most commonly occurring ciphertext letter, etc.
4. Look for patterns and try to guess words. If no progress is made then refine the
previous guesses and try again.
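
Steps 1 to 3 of this strategy can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive attack: the function names (letter_counts, rank_by_frequency, initial_guess, apply_guess) are hypothetical, the frequency values are those of Table 2.1, and ties are broken alphabetically purely for convenience. Step 4 (spotting patterns and refining guesses) is left to the analyst.

```python
from collections import Counter
import string

# Approximate English letter frequencies in percent, copied from Table 2.1.
ENGLISH_FREQ = {
    "A": 8.167, "B": 1.492, "C": 2.782, "D": 4.253, "E": 12.702,
    "F": 2.228, "G": 2.015, "H": 6.094, "I": 6.966, "J": 0.153,
    "K": 0.772, "L": 4.025, "M": 2.406, "N": 6.749, "O": 7.507,
    "P": 1.929, "Q": 0.095, "R": 5.987, "S": 6.327, "T": 9.056,
    "U": 2.758, "V": 0.978, "W": 2.360, "X": 0.150, "Y": 1.974,
    "Z": 0.074,
}


def letter_counts(ciphertext):
    """Step 1: count occurrences of each letter A-Z, ignoring everything else."""
    return Counter(ch for ch in ciphertext.upper() if ch in string.ascii_uppercase)


def rank_by_frequency(freqs):
    """Return letters ordered from most to least frequent.

    Ties are broken alphabetically (an arbitrary choice made for this sketch).
    """
    return sorted(freqs, key=lambda ch: (-freqs[ch], ch))


def initial_guess(ciphertext):
    """Steps 2-3: pair the k-th most common ciphertext letter with the
    k-th most common English letter. Returns {ciphertext: plaintext}."""
    cipher_ranked = rank_by_frequency(letter_counts(ciphertext))
    english_ranked = rank_by_frequency(ENGLISH_FREQ)
    return dict(zip(cipher_ranked, english_ranked))


def apply_guess(ciphertext, guess):
    """Show guessed plaintext letters in lower case; letters without a
    guess, and non-letters, are left unchanged (letters in upper case)."""
    return "".join(
        guess[ch].lower() if ch in guess else ch
        for ch in ciphertext.upper()
    )
```

For example, initial_guess("HHHHWWWAA") pairs H with E, W with T and A with A. On a real ciphertext, and especially a short one, the observed ranking will usually deviate from Table 2.1, so the result is best treated as a starting point that is corrected by hand.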
As an example, consider the ciphertext letter histogram generated in Figure 2.3
for an English plaintext using an unknown Simple Substitution Cipher key. It
would be reasonable from this histogram to guess that:
• ciphertext H represents plaintext E;
• ciphertext W represents plaintext T;
 
 