Cryptography Reference
In-Depth Information
Table 2.2: Letter frequency analysis: theory versus practice
Number of ciphertext letters    Theory             Practice
Less than 5                     Many plaintexts    Many plaintexts
Between 5 and 27                Reducing to 1      Hard to find
28                              1                  Hard to find
Between 29 and 200              1                  Getting easier to find
More than 200                   1                  Easy to find
theory suggests that the Simple Substitution Cipher is not fit for use if a plaintext
is anywhere close to 28 letters long (which effectively makes it useless for most
applications). However, practice suggests that if it were used to encrypt a 50-letter
plaintext then we might well 'get away with it'. In this case, the gap arises because
theory only tells us that something exists, not how to find it.
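The threshold of 28 letters in Table 2.2 is consistent with a standard back-of-the-envelope estimate, often called the unicity distance: the entropy of the key divided by the per-letter redundancy of the language. The sketch below assumes a redundancy of roughly 3.2 bits per letter for English. That is a commonly quoted approximation, not a value taken from this text, and the function and constant names are illustrative only.

```python
import math


def unicity_distance(num_keys: int, redundancy_bits_per_letter: float) -> float:
    """Estimate how many ciphertext letters are needed before only one
    meaningful plaintext is expected to remain."""
    key_entropy_bits = math.log2(num_keys)
    return key_entropy_bits / redundancy_bits_per_letter


# The Simple Substitution Cipher has 26! possible keys.
SIMPLE_SUBSTITUTION_KEYS = math.factorial(26)

# Assumption: English carries about 3.2 bits of redundancy per letter.
ASSUMED_ENGLISH_REDUNDANCY = 3.2

estimate = unicity_distance(SIMPLE_SUBSTITUTION_KEYS, ASSUMED_ENGLISH_REDUNDANCY)
print(f"Estimated letters needed: {estimate:.1f}")
```

Under this assumption the estimate falls just below 28. It only says that a single plaintext is expected to remain at that length. It says nothing about how much work is needed to find it, which is the gap between the Theory and Practice columns.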
2.2 Historical advances
In Section 2.1 we observed that all monoalphabetic ciphers can be broken using
letter frequency analysis. We now look at a number of more sophisticated
historical cryptosystems. These cryptosystems use a variety of techniques to
defeat single letter frequency analysis. It is these techniques that we are particularly
interested in, since they illustrate good cryptosystem design principles. Despite
this, none of the cryptosystems presented here are appropriate for use in modern
applications for reasons that we will indicate.
2.2.1 Design improvements
Before proceeding, it is worth reflecting on design features that could be built into
a cryptosystem in order to make single letter frequency analysis harder to conduct
or, better still, ineffective. Three possible approaches are:
1. Increase the size of the plaintext and ciphertext alphabets. The cryptosystems
that we have looked at thus far all operate on single letters. We can describe
this by saying that the plaintext (and ciphertext) alphabet is the set of all single
letters. When we conduct letter frequency analysis, we are only trying to match
26 possible ciphertext letters to 26 possible plaintext letters, which is generally
not a particularly difficult task. If there were a larger number of choices for each
unit of plaintext (ciphertext) then frequency analysis would certainly be harder.
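To give a feel for this first approach, the following sketch compares how many distinct units an attacker must match when a cipher operates on single letters and when it operates on blocks of letters. Treating the larger alphabet as non-overlapping blocks of letters is an assumption made here for illustration, and the function names are hypothetical.

```python
from collections import Counter
import string


def alphabet_size(block_length: int) -> int:
    """Number of distinct plaintext units when each unit is a block of letters."""
    return len(string.ascii_uppercase) ** block_length


def unit_frequencies(text: str, block_length: int) -> Counter:
    """Count non-overlapping blocks of letters, ignoring non-letters.
    A trailing incomplete block is dropped."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    blocks = (
        "".join(letters[i:i + block_length])
        for i in range(0, len(letters) - block_length + 1, block_length)
    )
    return Counter(blocks)


sample = "THE CAT SAT ON THE MAT"

# Single letters give 26 possible units; pairs of letters give 26 * 26 = 676.
print(alphabet_size(1), alphabet_size(2))

# With the same amount of text, each pair is seen far less often than each
# single letter, so the frequency statistics are much thinner.
print(unit_frequencies(sample, 1).most_common(3))
print(unit_frequencies(sample, 2).most_common(3))
```

With pairs as the unit, the same text yields about half as many observations spread over 676 possible values, so far more ciphertext is needed before frequency counts become informative.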
 
 
 