Cryptography Reference
In-Depth Information
0
1
0
1
1
0
A
S
W
01
T
01
*
E
Figure 19.1: An imperfect modification of the Huffman tree from
Figure 5.1. The extra synchronization character, '*', is not unique.
19.3 Extending Other Tools
Section 5.2.1 describes Huffman coding , a compression algorithm
that will produce variable-length codes for characters and compress
the bit stream by giving shorter codes to more common letters and
longer codes to the rarer ones. The standard algorithm may be very
good at squeezing out the extra entropy from a file, but it is extremely
fragile. If only one bit is lost, the rest of the stream can be compro-
mised.
Adding a synchronization character to a Huffman code is not as
easy as creating another character and adding it the mix. Figure 19.1
is a modified version of Figure 5.1 created by adding another charac-
ter, '*', to the alphabet to act as a synchronizer. This seems to work.
Adding the asterisk at the beginning of a word or a paragraph would
be a signal. The word 'ate' would be encoded as '1111011101110'.
Anyone who needs to synchronize a stream of bits could look for the
four 1s in a row and then start forward from that point.
This would seem to work. There are four 1s in the asterisk's code
block, '1111', and because of the shape of the tree, four 1s only occur
in this asterisk. But there are three problems:
The rest of the tree is not guaranteed to be as short as it is in this
example. This can be guaranteed by assigning the synchroniza-
tion character the lowest probability, something that ensures it
will be the longest code in the tree.
Two letters together can produce the synchronization code.
Search WWH ::




Custom Search