Historical Cryptosystems - Everyday Cryptography

Table 2.1: Approximate letter frequencies for the English language [29]

Letter  Frequency (%)    Letter  Frequency (%)    Letter  Frequency (%)
A        8.167           B        1.492           C        2.782
D        4.253           E       12.702           F        2.228
G        2.015           H        6.094           I        6.966
J        0.153           K        0.772           L        4.025
M        2.406           N        6.749           O        7.507
P        1.929           Q        0.095           R        5.987
S        6.327           T        9.056           U        2.758
V        0.978           W        2.360           X        0.150
Y        1.974           Z        0.074

key the same plaintext letter is always encrypted to the same ciphertext letter. Cryptosystems where the encryption algorithm has this property are usually referred to as being monoalphabetic ciphers. Given a ciphertext, suppose that we:

• Know that it has been encrypted using a monoalphabetic cipher, in our case the Simple Substitution Cipher. This is reasonable, since we normally assume knowledge of the encryption algorithm used (see Section 1.5.1).

• Know the language in which the plaintext is expressed. This is reasonable, since in most contexts we either know this or could at least guess.
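To make the monoalphabetic property concrete, the following is a minimal Python sketch of a Simple Substitution Cipher. The function name `encrypt` and the key `EXAMPLE_KEY` are assumptions made purely for illustration and do not come from the text; the key is simply some permutation of the alphabet.

```python
import string

ALPHABET = string.ascii_uppercase

# Hypothetical key for illustration only: a permutation of A-Z, where the
# letter in position i replaces the i-th letter of the alphabet.
EXAMPLE_KEY = "QWERTYUIOPASDFGHJKLZXCVBNM"


def encrypt(plaintext: str, key: str) -> str:
    """Encrypt with a Simple Substitution Cipher under the given key.

    Non-letters (spaces, punctuation) are passed through unchanged.
    """
    table = str.maketrans(ALPHABET, key)
    return plaintext.upper().translate(table)


ciphertext = encrypt("ATTACK AT DAWN", EXAMPLE_KEY)
print(ciphertext)  # QZZQEA QZ RQVF
```

Every plaintext A becomes Q and every T becomes Z wherever it occurs, so the relative frequencies of the plaintext letters carry over unchanged into the ciphertext.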

The following strategy then presents itself:

1. Count the frequency of occurrence of letters in the ciphertext, which can be represented as a histogram.

2. Compare the ciphertext letter frequencies with the letter frequencies of the underlying plaintext language.

3. Make an informed guess that the most commonly occurring ciphertext letter represents the most commonly occurring plaintext letter. Repeat with the second most commonly occurring ciphertext letter, etc.

4. Look for patterns and try to guess words. If no progress is made then refine the previous guesses and try again.
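Steps 1 to 3 of this strategy can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `letter_counts`, `initial_guess` and `apply_guess` are hypothetical, and the ordering in `ENGLISH_BY_FREQUENCY` is derived from the values in Table 2.1. Step 4 (spotting patterns and refining guesses) still relies on human judgement and is not automated here.

```python
from collections import Counter
import string

# English letters from most to least frequent, ordered using Table 2.1.
ENGLISH_BY_FREQUENCY = "ETAOINSHRDLCUMWFGYPBVKJXQZ"


def letter_counts(ciphertext: str) -> Counter:
    """Step 1: count how often each letter occurs in the ciphertext."""
    return Counter(ch for ch in ciphertext.upper() if ch in string.ascii_uppercase)


def initial_guess(ciphertext: str) -> dict:
    """Steps 2 and 3: pair the most common ciphertext letter with the most
    common English letter, the second most common with the second, etc."""
    ranked = [letter for letter, _ in letter_counts(ciphertext).most_common()]
    return dict(zip(ranked, ENGLISH_BY_FREQUENCY))


def apply_guess(ciphertext: str, guess: dict) -> str:
    """Substitute guessed plaintext letters; leave other characters alone."""
    return "".join(guess.get(ch, ch) for ch in ciphertext.upper())
```

On short ciphertexts the observed frequencies may differ noticeably from Table 2.1, so the output of `initial_guess` should be treated as a starting point to be corrected by hand rather than as a decryption.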

As an example, consider the ciphertext letter histogram generated in Figure 2.3 for an English plaintext using an unknown Simple Substitution Cipher key. It would be reasonable from this histogram to guess that:

• ciphertext H represents plaintext E;

• ciphertext W represents plaintext T;
