Basic Mimicry - Disappearing Cryptography

it is not surprising that the five letters “ompre” and “press” also occur 84 times.

The text is generated in a process guided by these statistics. The text begins by selecting one group of five letters at random. In the Figure, the first five letters are “The l”. The process then uses the statistics to dictate which letters can follow. In the draft of Chapter 5, the five letters “he la” occur 2 times, the letters “he le” occur 16 times and the letters “he lo” occur 2 times. If the fifth-order text is going to mimic the statistical profile of Chapter 5, then there should be a 2 out of 20 chance that the letter “a” follows the random “The l”. Of course, there should also be a 16 out of 20 chance that it is an “e” and a 2 out of 20 chance that it is an “o”.
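
The weighting described above can be sketched in a few lines of Python. This is only an illustration, not code from the book: the names `followers` and `pick_next` are hypothetical, and the counts are the ones quoted from the draft of Chapter 5.

```python
import random

# Counts of the letter that follows "he l" in the sample text.
# The figures are those quoted above; the variable name is an assumption.
followers = {"a": 2, "e": 16, "o": 2}

def pick_next(counts, rng=random):
    """Pick one letter with probability proportional to its count."""
    letters = list(counts)
    weights = [counts[c] for c in letters]
    return rng.choices(letters, weights=weights, k=1)[0]

total = sum(followers.values())
for letter, count in followers.items():
    print(f"{letter!r}: {count} out of {total}")
print("next letter:", pick_next(followers))
```

Because the choice is random, repeated runs will usually print “e” but will occasionally print “a” or “o”.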

This process is repeated until enough text is generated. It is often amazing just how real the result sounds. To a large extent, this is caused by the small size of the sample text. If you assume that there are about 64 printable characters in a text file, then there are about 64^5 (roughly one billion) different combinations of five letters. Obviously, many of them, like “zqTuV”, never occur in the English language, but a large number of them must make their way into the table if the algorithm is to have many choices. In the last example, there were three possible choices for a letter to follow “The l”. The phrase “The letter” is common in Chapter 5, but the phrase “The listerine” is not. In many cases, there is only one possible choice, dictated by the small number of words used in the sample. This is what gives the output such a real-sounding pattern.
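
To get a feel for how little of that space a sample actually touches, one can count the distinct five-letter groups in a text and compare the result with 64^5. The snippet below is a rough sketch: the helper `distinct_groups` and the sample string are made up for illustration, and a real chapter would be far longer.

```python
def distinct_groups(text, n=5):
    """Return the set of distinct n-letter groups that occur in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

# Upper bound under the 64-printable-character assumption.
possible = 64 ** 5

# A made-up sample, used here only as a stand-in for real source text.
sample = "The letter follows the letter. The lesson follows the letter."
seen = distinct_groups(sample)

print(f"possible groups: {possible:,}")
print(f"groups seen in sample: {len(seen)}")
print(f"fraction of the space used: {len(seen) / possible:.2e}")
```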

Here's the algorithm for generating nth-order text called T given a source text S:

1. Construct a list of all combinations of n letters that occur in S and keep track of how many times each of these occurs in S.

2. Choose one at random to be a seed. This will be the first n letters of T.

3. Repeat this loop until enough text is generated:

   (a) Take the last n − 1 letters of T.

   (b) Search through the statistical table and find all combinations of letters that begin with these n − 1 letters.

   (c) The last letters of these combinations are the set of possible choices for the next letter to be added to T.

   (d) Choose among these letters and use the frequency of their occurrences in S to weight your choice.

   (e) Add it to T.
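
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the book's own implementation: the function names `build_table` and `mimic` are hypothetical, the seed is drawn uniformly from the distinct groups, and the loop simply stops if it reaches a group with no recorded follower (a case the steps above do not address).

```python
import random
from collections import Counter

def build_table(source, n):
    """Step 1: count every group of n letters that occurs in source."""
    return Counter(source[i:i + n] for i in range(len(source) - n + 1))

def mimic(source, n, length, rng=None):
    """Generate up to `length` characters of nth-order text from source."""
    rng = rng or random.Random()
    table = build_table(source, n)
    if not table:
        raise ValueError("source must contain at least n characters")
    # Step 2: a random group from the table seeds the output
    # (uniform over distinct groups; this is an assumption).
    text = rng.choice(sorted(table))
    while len(text) < length:
        # Step 3(a): the last n - 1 letters generated so far.
        prefix = text[len(text) - (n - 1):]
        # Steps 3(b) and 3(c): groups starting with the prefix, keyed
        # by their last letter, with their occurrence counts.
        choices = {g[-1]: c for g, c in table.items() if g.startswith(prefix)}
        if not choices:
            break  # dead end: no group in the table starts with this prefix
        # Step 3(d): weight the choice by frequency in the source.
        letters = list(choices)
        weights = [choices[c] for c in letters]
        # Step 3(e): append the chosen letter.
        text += rng.choices(letters, weights=weights, k=1)[0]
    return text

# Made-up sample text; any longer source can be substituted.
sample = "the letter and the lesson and the ladder and the lotion " * 3
print(mimic(sample, n=5, length=80, rng=random.Random(1)))
```

Scanning the whole table on every pass mirrors step 3(b) directly. For a large source, an index keyed by the first n − 1 letters would avoid the repeated scan, at the cost of a slightly longer setup step.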
