Arithmetic Coding - Introduction to Data Compression

Databases Reference

In-Depth Information

the symbol a 3 . Suppose the second symbol in the sequence was a 2 . The tag value would then

be restricted to lie in the interval [0.49, 0.56). We would then partition this interval in the same

proportion as the original interval to obtain the subintervals [0.49, 0.539) corresponding to

the symbol a 1 , [

)

corresponding to the symbol a 2 , and [0.546, 0.56) correspond-

ing to the symbol a 3 . If the third symbol was a 3 , the tag would be restricted to the interval

[0.546, 0.56), which could then be subdivided further. This process is described graphically

in Figure 4.1 .

Notice that the appearance of each new symbol restricts the tag to a subinterval that is

disjoint from any other subinterval that may have been generated using this process. For the

sequence beginning with a 1 ,

539

546

, by the time the third symbol a 3 is received, the tag

has been restricted to the subinterval [0.546, 0.56). If the third symbol had been a 1 instead

of a 3 , the tag would have resided in the subinterval [0.49, 0.539), which is disjoint from the

subinterval [0.546, 0.56). Even if the two sequences are identical from this point on (one

starting with a 1 ,

a 2 ,

a 3 ,...

a 2 ,

a 3 and the other beginning with a 1 ,

a 2 ,

a 1 ), the tag interval for the two

sequences will always be disjoint.

As we can see, the interval in which the tag for a particular sequence resides is disjoint

from all intervals in which the tag for any other sequence may reside. As such, any member

of this interval can be used as a tag. One popular choice is the lower limit of the interval;

another possibility is the midpoint of the interval. For the moment, let's use the midpoint of

the interval as the tag.

In order to see how the tag generation procedure works mathematically, we start with

sequences of length one. Suppose we have a source that puts out symbols from so m e alphabet

A ={

a 1 ,

a 2 ,...,

a m }

. We can map the symbols

{

a i }

to real numbers

{

}

. Define T X (

a i )

i − 1

2 P

T X (

a i ) =

1 P

(

) +

(

)

(2)

2 P

F X (

−

) +

(

)

(3)

For each a i ,

T X (

a i )

will have a unique value. This value can be used as a unique tag for a i .

Example4.3.2:

Consider a simple dice-throwing experiment with a fair die. The outcomes of a roll of the die

can be mapped into the numbers

{

,...,

}

. For a fair die

(

) =

for k

,...,

Therefore, using ( 3 ) we can find the tag for X

2as

2 P

6 +

12 =

T X (

) =

(

) +

(

) =

Introduction to Data Compression

Search WWH ::

Custom Search

Home