FIGURE 2.1 $H_n$ in bits per letter for $n = 1, \ldots, 12$ for Wealth of Nations. [Plot omitted: horizontal axis is block size n, 0 to 12; vertical axis runs from 0 to 5.]
For most sources, Equations (2) and (4) are not identical. If we need to distinguish between
the two, we will call the quantity computed in (4) the first-order entropy of the source, while
the quantity in (2) will be referred to as the entropy of the source.
In general, it is not possible to know the entropy of a physical source, so we have to estimate
it. The estimate of the entropy depends on our assumptions about the structure of
the source sequence.
Consider the following sequence:
1 2 3 2 3 4 5 4 5 6 7 8 9 8 9 10
Assuming the frequency of occurrence of each number is reflected accurately in the number of
times it appears in the sequence, we can estimate the probability of occurrence of each symbol
as follows:
$$P(1) = P(6) = P(7) = P(10) = \frac{1}{16}$$
$$P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = \frac{2}{16}$$
Assuming the sequence is iid, the entropy for this sequence is the same as the first-order entropy
defined in (4). The entropy can then be calculated as
$$H = -\sum_{i=1}^{10} P(i) \log_2 P(i).$$
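The calculation described above can be sketched in a few lines of Python. This is only an illustration under the stated assumptions (relative frequencies stand in for the true probabilities, and the sequence is treated as iid). The function names `estimate_probabilities` and `first_order_entropy` are hypothetical and do not come from the text.

```python
from collections import Counter
from math import log2


def estimate_probabilities(symbols):
    """Estimate P(symbol) as its relative frequency in the observed sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return {symbol: count / total for symbol, count in counts.items()}


def first_order_entropy(symbols):
    """First-order entropy in bits per symbol, assuming the sequence is iid."""
    probs = estimate_probabilities(symbols)
    return -sum(p * log2(p) for p in probs.values())


# The 16-symbol example sequence, written as individual symbols.
sequence = [1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10]

probs = estimate_probabilities(sequence)
entropy = first_order_entropy(sequence)

print(probs[1], probs[2])  # 0.0625 0.125, i.e. 1/16 and 2/16
print(entropy)             # 3.25
```

Under these assumptions the estimate is 3.25 bits per symbol: the four symbols with probability 1/16 contribute 4 × (1/16) × 4 = 1 bit, and the six symbols with probability 2/16 contribute 6 × (2/16) × 3 = 2.25 bits.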