experiment. If P(A) is the probability that the event A will occur, then the self-information associated with A is given by

$$ i(A) = \log_b \frac{1}{P(A)} = -\log_b P(A) \qquad (1) $$
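Equation (1) is straightforward to evaluate directly. The following is a minimal sketch (the function name and argument checks are ours, not from the text) that computes the self-information of an event with probability p for a given log base:

```python
import math

def self_information(p, base=2):
    """Self-information i(A) = -log_b P(A), as in Equation (1).

    p must lie in (0, 1]; the default base of 2 gives the result in bits.
    """
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p, base)

# A lower-probability event carries more self-information:
print(self_information(0.5))    # 1 bit
print(self_information(0.125))  # 3 bits
```

Note how halving the probability repeatedly adds a fixed amount of information, which is exactly the logarithmic behavior the definition is built on.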
Note that we have not specified the base b of the log function. We will discuss the choice
of the base later in this section. The use of the logarithm to obtain a measure of information
was not an arbitrary choice as we shall see in Section 2.2.1. But first let's see if the use of a
logarithm in this context makes sense from an intuitive point of view. Recall that log(1) = 0, and −log(x) increases as x decreases from one to zero. Therefore, if the probability of an
event is low, the amount of self-information associated with it is high; if the probability of an
event is high, the information associated with it is low. Even if we ignore the mathematical
definition of information and simply use the definition we use in everyday language, this makes
some intuitive sense. The barking of a dog during a burglary is a high-probability event and,
therefore, does not contain too much information. However, if the dog did not bark during a
burglary, this is a low-probability event and contains a lot of information. (Obviously, Sherlock
Holmes understood information theory!) 1 Although this equivalence of the mathematical and
semantic definitions of information holds true most of the time, it does not hold all of the
time. For example, a totally random string of letters will contain more information (in the
mathematical sense) than a well-thought-out treatise on information theory.
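To make the random-string remark concrete, here is a toy sketch under an i.i.d. letter model (the model and the specific probabilities are our assumptions for illustration, not from the text). A uniformly random 20-letter string over a 26-letter alphabet carries far more self-information than a string whose every letter is highly predictable:

```python
import math

# Toy i.i.d. model: each letter is drawn independently.
# Uniformly random string: each of 20 letters has probability 1/26.
p_random = (1 / 26) ** 20
# Highly predictable string: each letter is the expected one with P = 0.9.
p_predictable = 0.9 ** 20

i_random = -math.log2(p_random)            # about 94 bits
i_predictable = -math.log2(p_predictable)  # about 3 bits
print(i_random, i_predictable)
```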
Another property of this mathematical definition of information that makes intuitive sense
is that the information obtained from the occurrence of two independent events is the sum of
the information obtained from the occurrence of the individual events. Suppose A and B are
two independent events. The self-information associated with the occurrence of both event A
and event B is, by Equation (1),

$$ i(AB) = \log_b \frac{1}{P(AB)} $$
As A and B are independent,

$$ P(AB) = P(A)P(B) $$

and

$$ i(AB) = \log_b \frac{1}{P(A)P(B)} = \log_b \frac{1}{P(A)} + \log_b \frac{1}{P(B)} = i(A) + i(B) $$
The unit of information depends on the base of the log. If we use log base 2, the unit is bits; if we use log base e, the unit is nats; and if we use log base 10, the unit is hartleys. In general, if we do not explicitly specify the base of the log, we will be assuming a base of 2.
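Because logarithms in different bases are proportional, the three units differ only by constant factors: 1 bit = ln(2) nats = log₁₀(2) hartleys. A short sketch (variable names ours) computes the same self-information in all three units:

```python
import math

p = 0.5  # probability of the event
bits     = -math.log2(p)   # base 2  -> bits
nats     = -math.log(p)    # base e  -> nats
hartleys = -math.log10(p)  # base 10 -> hartleys

# Same quantity of information, expressed in three units:
print(bits, nats, hartleys)
```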
1 Silver Blaze by Arthur Conan Doyle.