Digital Signal Processing Reference
In-Depth Information
HUFFMAN algorithm produces a "code-tree" which results from the probability
of the individual symbols. The code tree is used to establish the code words for the
individual symbols.
The process is described in detail in Illustration 224.
Decoding:
in order that the receiver can recognise the original data from the byte sequence
the HUFFMAN tree must also be transmitted. When "descending from above" you
always land on one of the 7 different symbols selected in Illustration 224. Then
you have to jump back up to the top and branch out to the left and right until the
next signal is reached.
Note :
The longer for instance the text is, the less room is taken up by the transmission in
addition of the HUFFMAN tree.
LZW encoding
This process is named after its originators LEMPEL, ZIV and (later) WELCH. It is
probably the most commonly used process for all-purpose compression. Thus it is used in
the ZIP compression of files and of many graphic formats (e.g. GIF). Compression factors
of 5:1 are usual.
Illustration 225 shows top left the content of the string of numbers before and after each
step in the encoding process. In the first step the longest pattern found will only be a single
letter which is to be found in a standard dictionary. In our example this is "L". In the same
step the next letter to be examined is "Z" and is attached to the L.
The chain of signs which is produced in this way is definitely not in the dictionary and is
entered for the first time under the Index (256). After this in the last step the "longest"
chain of signs found - i.e. the "L" - is removed and is issued at the same time (see
"Recognised pattern"). In this way "Z" becomes the first sign of the new string.
Here everything begins again from the beginning. "Z" is now the longest known pattern.
"ZW" is stored under the Index (257) in the lexicon. "Z" is now removed, issued and a
new round begins. "W" is now the longest chain of signs entered in the lexicon so far.
"WL" is now included under the index number (258), the "W" is deleted from the entry
and issued.
It is only now that something interesting happens with regard to the desired compression.
The longest known pattern is now a sequence of signs which was previously entered in
the lexicon for the first time ("LZ" with the index number 256). In the next step, the index
of the pattern from the lexicon is issued and not just two individual signs.
As the dictionary contains 4096 (= 212 ) the entries get longer and longer in the case of a
very long suitable entry string (this is the file which is to be compressed). More and more
often longer sequences of signs for which short indexes are transmitted belong to the
higher indexes.
Search WWH ::




Custom Search