Databases Reference
In-Depth Information
The Huffman code can be described in terms of a binary tree similar to the ones shown in
Figure 3.4 . The squares denote the external nodes or leaves and correspond to the symbols in
the source alphabet. The codeword for a symbol can be obtained by traversing the tree from
the root to the leaf corresponding to the symbol, where 0 corresponds to a left branch and 1
corresponds to a right branch. In order to describe how the adaptive Huffman code works,
we add two other parameters to the binary tree: the weight of each leaf, which is written as a
number inside the node, and a node number . The weight of each external node is simply the
number of times the symbol corresponding to the leaf has been encountered. The weight of
each internal node is the sum of the weights of its offspring. The node number y i is a unique
number assigned to each internal and external node. If we have an alphabet of size n , then
the 2 n
1 internal and external nodes can be numbered as y 1 ,...,
y 2 n 1 such that if x j is the
weight of node y j ,wehave x 1
x 2 n 1 . Furthermore, the nodes y 2 j 1 and y 2 j
are offspring of the same parent node, or siblings, for 1
x 2 ···
n , and the node number for the
parent node is greater than y 2 j 1 and y 2 j . These last two characteristics are called the sibling
property , and any tree that possesses this property is a Huffman tree [ 21 ].
In the adaptive Huffman coding procedure, neither transmitter nor receiver knows anything
about the statistics of the source sequence at the start of transmission. The tree at both the
transmitter and the receiver consists of a single node that corresponds to all symbols not yet
transmitted (NYT) and has a weight of zero. As transmission progresses, nodes corresponding
to symbols transmitted are added to the tree, and the tree is reconfigured using an update
procedure. Before the beginning of transmission, a fixed code for each symbol is agreed upon
between transmitter and receiver. A simple (short) code is as follows:
If the source has an alphabet
j
<
(
a 1 ,
a 2 ,...,
a m )
of size m , then pick e and r such that
2 e
2 e . The letter a k is encoded as the
m
=
+
r and 0
r
<
(
e
+
1
)
-bit binary representation
of k
1, if 1
k
2 r ;else, a k is encoded as the e -bit binary representation of k
r
1.
=
=
=
For example, suppose m
10. The symbol a 1 is encoded as 00000,
the symbol a 2 is encoded as 00001, and the symbol a 22 is encoded as 1011.
When a symbol is encountered for the first time, the code for the NYT node is transmitted,
followed by the fixed code for the symbol. A node for the symbol is then created, and the
symbol is taken out of the NYT list.
Both transmitter and receiver start with the same tree structure. The updating procedure
used by both transmitter and receiver is identical. Therefore, the encoding and decoding
processes remain synchronized.
26, then e
4, and r
3.4.1 Update Procedure
The update procedure requires that the nodes be in a fixed order. This ordering is preserved by
numbering the nodes. The largest node number is given to the root of the tree, and the smallest
number is assigned to the NYT node. The numbers from the NYT node to the root of the tree
are assigned in increasing order from left to right and from lower level to upper level. The set
of nodes with the same weight makes up a block . Figure 3.12 is a flowchart of the updating
procedure.
The function of the update procedure is to preserve the sibling property. In order that the
update procedures at the transmitter and receiver both operate with the same information, the
Search WWH ::




Custom Search