Basic Mimicry - Disappearing Cryptography

Cryptography Reference

In-Depth Information

1

0

a

0

1

c

b

Figure 6.4: An ordinary Huffman tree built for three characters, “a”,

“b”, and “c” that occur with probabilities of 50%, 37.5%, and 12.5%

respectively.

Figure 6.5 shows a new version of the Huffman tree designed to

balance the distribution. There are now two extra layers added to the

tree. The branching choices made in these extra two layers would

use extra bits supplied by a pseudo-random generator. When they

were recovered, these bits would be discarded. It should be easy to

establish that “b” will emerge 37.5%of the time and “c” will be output

12.5% of the time if the data being hidden is perfectly distributed.

The cost of this process is efficiency. The new tree may produce

output with the right distribution, but decoding is often not possible.

The letter “b” is produced from the leaves with addresses 100, 101,

and 110. Since only the first bit remains constant with the tree in

Figure 6.5 then only one bit can be hidden with the letter “b”. The

other two bits would be produced by the pseudo-random bit stream

and not recovered at the other end. The tree in Figure 6.4 would hide

two bits with the letter “b”, but it would produce a “b” 25% of the

time. This is the trade-off of efficiency versus accuracy.

How many bits are hidden or encoded if a “c” is output? It could

either be three that are encoded when a 111 is found in the input file

or it could be one bit padded in the same manner as the letter “b”.

Either choice is fine.

This technique can be extended significantly to support any

amount of precision. The most important step is to make sure that

there will be no ambiguity in the decoding process. If the same char-

acter exists on both branches, then no bit can be encoded using any

of the subtree descending from this point.

Thismeansthatitisnotpossibletoencodedatawhichisdom-

Search WWH ::

Custom Search

Home