experiment. If P(A) is the probability that the event A will occur, then the self-information associated with A is given by

$$ i(A) = \log_b \frac{1}{P(A)} = -\log_b P(A) \qquad (1) $$
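Equation (1) is straightforward to evaluate directly. The following is a minimal sketch (the function name and argument checks are ours, not from the text) that computes the self-information of an event with probability p for a given log base:

```python
import math

def self_information(p, base=2):
    """Self-information i(A) = -log_b P(A), as in Equation (1).

    p must lie in (0, 1]; the default base of 2 gives the result in bits.
    """
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p, base)

# A lower-probability event carries more self-information:
print(self_information(0.5))    # 1 bit
print(self_information(0.125))  # 3 bits
```

Note how halving the probability repeatedly adds a fixed amount of information, which is exactly the logarithmic behavior the definition is built on.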
Note that we have not specified the base b of the log function. We will discuss the choice
of the base later in this section. The use of the logarithm to obtain a measure of information
was not an arbitrary choice as we shall see in Section 2.2.1. But first let's see if the use of a
logarithm in this context makes sense from an intuitive point of view. Recall that log(1) = 0, and −log(x) increases as x decreases from one to zero. Therefore, if the probability of an
event is low, the amount of self-information associated with it is high; if the probability of an
event is high, the information associated with it is low. Even if we ignore the mathematical
definition of information and simply use the definition we use in everyday language, this makes
some intuitive sense. The barking of a dog during a burglary is a high-probability event and,
therefore, does not contain too much information. However, if the dog did not bark during a
burglary, this is a low-probability event and contains a lot of information. (Obviously, Sherlock
Holmes understood information theory!) 1 Although this equivalence of the mathematical and
semantic definitions of information holds true most of the time, it does not hold all of the
time. For example, a totally random string of letters will contain more information (in the
mathematical sense) than a well-thought-out treatise on information theory.
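To make the random-string remark concrete, here is a toy sketch under an i.i.d. letter model (the model and the specific probabilities are our assumptions for illustration, not from the text). A uniformly random 20-letter string over a 26-letter alphabet carries far more self-information than a string whose every letter is highly predictable:

```python
import math

# Toy i.i.d. model: each letter is drawn independently.
# Uniformly random string: each of 20 letters has probability 1/26.
p_random = (1 / 26) ** 20
# Highly predictable string: each letter is the expected one with P = 0.9.
p_predictable = 0.9 ** 20

i_random = -math.log2(p_random)            # about 94 bits
i_predictable = -math.log2(p_predictable)  # about 3 bits
print(i_random, i_predictable)
```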
Another property of this mathematical definition of information that makes intuitive sense
is that the information obtained from the occurrence of two independent events is the sum of
the information obtained from the occurrence of the individual events. Suppose A and B are
two independent events. The self-information associated with the occurrence of both event A
and event B is, by Equation (1),

$$ i(AB) = \log_b \frac{1}{P(AB)} $$
As A and B are independent,

$$ P(AB) = P(A)P(B) $$

and

$$ i(AB) = \log_b \frac{1}{P(A)P(B)} = \log_b \frac{1}{P(A)} + \log_b \frac{1}{P(B)} = i(A) + i(B) $$
The unit of information depends on the base of the log. If we use log base 2, the unit is bits; if we use log base e, the unit is nats; and if we use log base 10, the unit is hartleys. In general, if we do not explicitly specify the base of the log, we will be assuming a base of 2.
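Because logarithms in different bases are proportional, the three units differ only by constant factors: 1 bit = ln(2) nats = log₁₀(2) hartleys. A short sketch (variable names ours) computes the same self-information in all three units:

```python
import math

p = 0.5  # probability of the event
bits     = -math.log2(p)   # base 2  -> bits
nats     = -math.log(p)    # base e  -> nats
hartleys = -math.log10(p)  # base 10 -> hartleys

# Same quantity of information, expressed in three units:
print(bits, nats, hartleys)
```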
1 Silver Blaze by Arthur Conan Doyle.