Utilities - Data Structures and Problem Solving Using Java

Java Reference

In-Depth Information

For instance, suppose that the encoded string is 010011110001011000

1000111. Figure 12.3 shows that 0 and 01 are not character codes but that 010

represents i, so the first character is i. Then 011 follows, which is an s. Then

11 follows, which is a newline ( nl ). The remainder of the code is a, sp, t, i, e,

and nl .

The character codes can be different lengths, as long as no character code

is a prefix of another character code, an encoding called a prefix code. Con-

versely, if a character is contained in a nonleaf node, guaranteeing unambigu-

ous decoding is no longer possible.

Thus our basic problem is to find the full binary tree of minimum cost (as

defined previously) in which all characters are contained in the leaves. The

tree shown in Figure 12.4 is optimal for our sample alphabet. As shown in

Figure 12.5, this code requires only 146 bits. There are many optimal codes,

which can be obtained by swapping children in the encoding tree.

In a prefix code , no

character code is a

prefix of another

character code.

This is guaranteed

if the characters

are only in leaves. A

prefix code can

be decoded

unambiguously.

figure 12.4

An optimal prefix

code tree

i

sp

e

a

t

s

nl

figure 12.5

Optimal prefix code

Character

Code

Frequency

Total Bits

a

001

10

30

e

01

15

30

i

10

12

24

s

00000

3

15

t

0001

4

16

sp

11

13

26

nl

00001

1

5

Total

146

Data Structures and Problem Solving Using Java

Search WWH ::

Custom Search

Home