Source reduction (at each step the two least probable entries are merged):

  Step 0: 0.01 (Z), 0.02 (Q), 0.05 (A), 0.07 (X), 0.38 (F), 0.47 (S)
  Step 1: 0.03, 0.05, 0.07, 0.38, 0.47
  Step 2: 0.07, 0.08, 0.38, 0.47
  Step 3: 0.15, 0.38, 0.47
  Step 4: 0.47, 0.53

Code assignment:

  Symbol   Probability   Code
  S        0.47          0
  F        0.38          11
  X        0.07          100
  A        0.05          1011
  Z        0.01          10100
  Q        0.02          10101
Fig. 2.4. Example of Huffman coding: source reduction and code assignment.
Thus we have achieved a compression ratio of 3/1.79 = 1.676 without
any information loss. It should be pointed out that the efficiency of Huffman
coding depends on the source probability distribution: if all the symbols in a
source occur with equal probability, Huffman coding will not achieve any data
reduction.
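The source-reduction and code-assignment procedure of Fig. 2.4 can be sketched in a few lines of Python; the function name and data layout here are illustrative, not from the text:

```python
import heapq
from itertools import count

def huffman_codes(probs):
    """Build a Huffman code for a {symbol: probability} mapping.

    Repeatedly merges the two least probable subtrees, prefixing '0'
    to one branch and '1' to the other. The exact bit patterns depend
    on tie-breaking, but the code lengths are optimal.
    """
    tick = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(p, next(tick), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # least probable subtree
        p1, _, c1 = heapq.heappop(heap)  # second least probable subtree
        merged = {s: "0" + code for s, code in c0.items()}
        merged.update({s: "1" + code for s, code in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tick), merged))
    return heap[0][2]

# The six-symbol source of Fig. 2.4
source = {"Z": 0.01, "Q": 0.02, "A": 0.05, "X": 0.07, "F": 0.38, "S": 0.47}
codes = huffman_codes(source)
avg = sum(source[s] * len(codes[s]) for s in source)  # average bits per symbol
```

For this source the code lengths come out as 1, 2, 3, 4, 5, and 5 bits for S, F, X, A, Z, and Q respectively, giving the average length of 1.79 bits per symbol used in the compression-ratio calculation above.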
2.2.3 Arithmetic Coding
The problem with Huffman coding lies in the fact that Huffman codes have to
be an integral number of bits. For example, if the probability of a character
is 1/3, the optimum number of bits to code that character is
-log2(1/3) ≈ 1.58. The Huffman coding scheme has to assign either 1 or 2 bits to the
code, and either choice leads to a longer bit length than the ideal one.
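This rounding loss can be made concrete with the source of Fig. 2.4: the entropy (the probability-weighted average of the ideal code lengths -log2(p)) is a lower bound that the integer-length Huffman code cannot reach. A short illustrative sketch:

```python
import math

# Symbol probabilities from Fig. 2.4
probs = [0.47, 0.38, 0.07, 0.05, 0.02, 0.01]

# Ideal code length of a symbol with probability p is -log2(p) bits;
# the entropy is the probability-weighted average of these lengths.
ideal = [-math.log2(p) for p in probs]
entropy = sum(p * l for p, l in zip(probs, ideal))
```

The entropy evaluates to about 1.71 bits per symbol, while the Huffman code of Fig. 2.4 averages 1.79 bits per symbol; the gap is the cost of rounding every code word to a whole number of bits.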
Arithmetic coding was invented by Rissanen [7]. While Huffman coding
rounds the ideal code lengths to nearby integer values and constructs
a code with those lengths, arithmetic coding actually manages to encode
symbols using non-integer numbers of bits.
In arithmetic coding, an input message of any length is represented as a
real number R in the range 0 ≤ R < 1. The longer the message, the more
precision is required for the number R. Assume that a signal source has M
symbols; its code construction process is as follows:
1) Divide the interval [0, 1) into segments corresponding to the M symbols;
the segment of each symbol has a length proportional to its probability.
2) Choose the segment of the first symbol in the string message.