Cryptography Reference
In-Depth Information
This is a hypothesis, but it is substantiated by the fact that the string '88'
occurs strikingly often in the text, and 'ee' is one of the most frequent
doubled letters in English. So the attacker is already looking at digrams,
i.e., two-letter pairs. That's pretty easy with the naked eye.
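What the eye does here can be sketched in a few lines of Python. This is only an illustration: the function name doubled_symbol_counts and the toy ciphertext are assumptions made for the example, not the text under discussion.

```python
from collections import Counter

def doubled_symbol_counts(ciphertext: str) -> Counter:
    """Count adjacent pairs made of the same symbol, such as '88'."""
    pairs = zip(ciphertext, ciphertext[1:])
    return Counter(a + b for a, b in pairs if a == b)

# Hypothetical toy ciphertext, NOT the one analysed in the text.
sample = "5388;48)88(;48"
print(doubled_symbol_counts(sample).most_common(3))
```

A doubled symbol that stands out in this count is a candidate for 'ee', 'll', 'ss' and similar doubled letters; the count alone does not decide which one.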
If '8' corresponds to 'e', then a pattern of three characters ending in '8'
should recur several times in the text, namely the counterpart of the
frequent word 'the'. Such a pattern occurs seven times. This means that we
might already have recovered three characters.
Let's continue testing the hypothesis.
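The search for a 'the' candidate can be sketched the same way: collect every three-symbol group that ends in the symbol assumed to be 'e' and see which one recurs. The function name trigrams_ending_in and the sample string are hypothetical and serve only as an illustration.

```python
from collections import Counter

def trigrams_ending_in(ciphertext: str, symbol: str) -> Counter:
    """Count three-symbol groups whose last symbol is `symbol`.

    Groups whose first two symbols are equal, or contain `symbol`,
    are skipped because 't', 'h' and 'e' are three different letters.
    """
    counts = Counter()
    for i in range(len(ciphertext) - 2):
        a, b, c = ciphertext[i:i + 3]
        if c == symbol and a != b and symbol not in (a, b):
            counts[a + b + c] += 1
    return counts

# Hypothetical toy ciphertext, NOT the one analysed in the text.
sample = ";48)5;48(6;48"
print(trigrams_ending_in(sample, "8").most_common(1))
```

A group that recurs much more often than the others is a plausible counterpart of 'the', which would reveal the symbols for 't' and 'h' as well.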
The cryptanalyst uses the recovered characters to guess several words and,
from these, recovers more letters. Step by step, but increasingly faster,
the cryptanalyst gets closer to his goal. The most important prerequisite
is that he knows what that goal should be: the English language.
From Theory to Practice: Automatic Decryption
One would assume that text encoded in this way can be read 'online' thanks
to modern computer technology, provided one has a suitable program. Still,
you could spend a long time searching the Internet for free software that
breaks substitution ciphers without human interaction. The only explanation I
have is this: the theory is clear, and a simple demonstration program for
cryptanalysis can be written quickly, though some manual work will remain in
the end. Apparently, no author has been interested in fully automatic
cryptanalysis so far. Well, there may have been such authors, but their
software has allegedly been locked up. As ridiculous as this may sound, I have
actually received serious hints.
I felt it was about time to do away with this deplorable state and tackled the
task myself. The frequency analysis described is poorly suited to programs,
because it requires too much understanding of the context and too much text.
So my idea was to test for 'forbidden' pairs rather than for particularly
frequent ones (which corresponds to negative pattern search, as we will see in
Section 3.4.1). The frequency of single characters should serve only to set up
an initial substitution scheme. In general, the decoding attempt will then
produce several forbidden pairs. Optimizing things by slightly varying the
substitution from one step to the next should then allow us to continually
reduce the number of forbidden pairs.
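To make the idea concrete, here is a minimal sketch of it as described, not the program actually written. The assumptions are all mine: the letter order in ENGLISH_ORDER is one commonly quoted approximation, the FORBIDDEN set is a tiny illustrative sample of pairs that practically never occur in English, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Assumption: approximate frequency order of English letters (sources differ).
ENGLISH_ORDER = "etaoinshrdlucmfwypvbgkjqxz"
# Assumption: a tiny sample of 'forbidden' pairs; a real table would be far larger.
FORBIDDEN = {"qz", "zq", "jq", "qj", "zx", "xz", "vq", "qv", "jx", "xj"}

def initial_key(ciphertext):
    """Initial substitution: match cipher symbols to letters by frequency rank."""
    ranked = [sym for sym, _ in Counter(ciphertext).most_common()]
    return dict(zip(ranked, ENGLISH_ORDER))

def decode(ciphertext, key):
    """Apply the substitution; symbols without an assignment become '?'."""
    return "".join(key.get(c, "?") for c in ciphertext)

def forbidden_count(plaintext):
    """Number of adjacent letter pairs that appear in FORBIDDEN."""
    return sum(a + b in FORBIDDEN for a, b in zip(plaintext, plaintext[1:]))

def improve(ciphertext, key):
    """One pass of small variations: swap two assignments at a time and
    keep a swap only if it lowers the forbidden-pair count."""
    best = forbidden_count(decode(ciphertext, key))
    for a, b in combinations(list(key), 2):
        key[a], key[b] = key[b], key[a]
        score = forbidden_count(decode(ciphertext, key))
        if score < best:
            best = score
        else:
            key[a], key[b] = key[b], key[a]  # undo the swap
    return best
```

One would call initial_key once and then repeat improve until the count stops falling. Note that the swaps only permute letters already assigned, so letters missing from the initial key are never tried in this simplified version.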
So much for a cute theory. However, experiments produced catastrophic
results; my idea was simply unusable. (National intelligence agencies are
likely to know more about such statistical niceties than cryptologists!)