While the explanation is interesting, it is not really necessary for understanding much of what we will study in this topic and can be skipped.
2.2.1 Derivation of Average Information
We start with the properties we want in our measure of average information. We will then show
that requiring these properties in the information measure leads inexorably to the particular
definition of average information, or entropy, that we have provided earlier.
Given a set of independent events $A_1, A_2, \ldots, A_n$ with probabilities $p_i = P(A_i)$, we desire the following properties in the measure of average information $H$:
1. We want $H$ to be a continuous function of the probabilities $p_i$. That is, a small change in $p_i$ should only cause a small change in the average information.
2. If all events are equally likely, that is, $p_i = 1/n$ for all $i$, then $H$ should be a monotonically increasing function of $n$. The more possible outcomes there are, the more information should be contained in the occurrence of any particular outcome.
3. Suppose we divide the possible outcomes into a number of groups. We indicate the
occurrence of a particular event by first indicating the group it belongs to, then indicating
which particular member of the group it is. Thus, we get some information first by
knowing which group the event belongs to; and then we get additional information
by learning which particular event (from the events in the group) has occurred. The
information associated with indicating the outcome in multiple stages should not be any
different than the information associated with indicating the outcome in a single stage.
For example, suppose we have an experiment with three outcomes, $A_1$, $A_2$, and $A_3$, with corresponding probabilities $p_1$, $p_2$, and $p_3$. The average information associated with this experiment is simply a function of the probabilities:
$$H = H(p_1, p_2, p_3)$$
Let's group the three outcomes into two groups:
$$B_1 = \{A_1\}, \qquad B_2 = \{A_2, A_3\}$$
The probabilities of the events $B_i$ are given by
$$q_1 = P(B_1) = p_1, \qquad q_2 = P(B_2) = p_2 + p_3$$
If we indicate the occurrence of an event $A_i$ by first declaring which group the event belongs to and then declaring which event occurred, the total amount of average information would be given by
$$H = H(q_1, q_2) + q_1 H\!\left(\frac{p_1}{q_1}\right) + q_2 H\!\left(\frac{p_2}{q_2}, \frac{p_3}{q_2}\right)$$
We require that the average information computed either way be the same.
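This grouping property is easy to check numerically with the standard Shannon entropy formula $H = -\sum_i p_i \log_2 p_i$ (which the text derives from these properties). The sketch below uses hypothetical probabilities $p_1 = 0.5$, $p_2 = 0.3$, $p_3 = 0.2$, chosen purely for illustration:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: -sum of p*log2(p); terms with p = 0 contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical probabilities for the three outcomes A1, A2, A3.
p1, p2, p3 = 0.5, 0.3, 0.2

# Single-stage computation: H(p1, p2, p3).
direct = entropy([p1, p2, p3])

# Two-stage computation: first the group (q1 = p1, q2 = p2 + p3),
# then the outcome within the group, weighted by the group probability.
q1, q2 = p1, p2 + p3
grouped = (entropy([q1, q2])
           + q1 * entropy([p1 / q1])            # H(1) = 0: group B1 has one member
           + q2 * entropy([p2 / q2, p3 / q2]))  # conditional entropy within B2

print(abs(direct - grouped) < 1e-12)  # True: both stagings give the same H
```

Note that the $q_1 H(p_1/q_1)$ term vanishes here, since $B_1$ contains a single event and $H(1) = 0$: learning which member of a one-element group occurred conveys no additional information.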