Cryptography Reference
In-Depth Information
Several people have approached a similar problem called gener-
ating a travesty . This was addressed in a series of Byte magazine arti-
cles [KO84, Way85] that described how to generate statistically equiv-
alent text. The articles didn't use the effect to hide data, but they
did concentrate on the most efficient way to generate it. This work
ends up being quite similar in practice to the homophonic ciphers
described by H. N. Jendal, Y. J. B. Kuhn, and J. L. Massey in [JKM90]
and generalized by C. G. Gunther in [Gun88].
Here are four different approaches to storing the statistical tables
needed to generate the data:
n boxes where
Giant Array Allocate an array with
c
c
is the number
of possible characters at each position and
n
is the order of the
statistics being kept. Obviously
can be as low as 27 if only
capital letters and spaces are kept. But it can also be 256 if all
possible values of a byte are stored. This may be practical for
small values of
c
n
, but it quickly grows impossible if there are
k
letters produced.
Giant List Create an alphabetical list of all of the entries. There is
one counter per node as well as a pointer and a string holding
the value in question. This makes the nodes substantially less
efficient than the array. This can still pay off if there are many
nodes that are kept out. If English text is being mimicked, there
are many combinations of several letters that don't occur. A list
is definitely more efficient.
Giant Tree Build a big tree that contains one path from the root to
a leaf for each letter combination found in the tree. This can
contain substantially more pointers, but it is faster to use than
the Giant List. Figure 6.3 illustrates an implementation of this.
Going Fishing Randomize the search. There is no statistical table
produced at all because
are too large. The source file
serves as a random source and it is consulted at random for
each choice of a new letter. This can be extremely slow, but
it may be the only choice if memory isn't available.
c
and
n
The first three solutions are fairly easy to implement for anyone
with a standard programming background. The array is the easiest.
The list is not hard. Anyone implementing the tree has a number of
choices. Figure 6.3 shows that the new branches at each level are
stored in a list. This could also be done in a binary tree to speed
lookup.
Search WWH ::




Custom Search