Huffman Coding - Introduction to Data Compression - page 47

Databases Reference

In-Depth Information

T A B L E 3 . 7

Reduced three-letter alphabet.

Letter

Probability

Codeword

a 1

0.4

α 2

a 2

0.4

c ( a 2 )

a 4

0.2

α 1

T A B L E 3 . 8

Reduced two-letter alphabet.

Letter

Probability

Codeword

a 2

0.6

α 3

a 1

0.4

α 2

T A B L E 3 . 9

Minimum variance Huffman

code.

Letter

Probability

Codeword

a 1

0.2

10

a 2

0.4

00

a 3

0.2

11

a 4

0.1

010

a 5

0.1

011

0

0

a 2

0

1

a 1

a 3

a 1

a 2

1

1

a 3

1

a 5

a 4

a 5

a 4

F I GU R E 3 . 4

Two Huffman trees corresponding to the same probabilities.

channel expects to receive 22,000 bits, no more and no less. As the bit generation rate will vary

around 22,000 bits per second, the output of the source coder is generally fed into a buffer. The

purpose of the buffer is to smooth out the variations in the bit generation rate. However, the

buffer has to be of finite size, and the greater the variance in the codewords, the more difficult

the buffer design problem becomes. Suppose that the source we are discussing generates a

string of a 4 s and a 5 s for several seconds. If we are using the first code, this means that we will

be generating bits at a rate of 40,000 bits per second. For each second, the buffer has to store

18,000 bits. On the other hand, if we use the second code, we would be generating 30,000

bits per second, and the buffer would have to store 8000 bits for every second this condition

persisted. If we have a string of a 2 s instead of a string of a 4 s and a 5 s, the first code would

result in the generation of 10,000 bits per second. Remember that the channel will still be

expecting 22,000 bits every second, so somehow we will have to make up a deficit of 12,000

Next Page

Introduction to Data Compression

Search WWH ::

Custom Search

Home