We have already shown that $\lfloor \bar{T}_X(x) \rfloor_{l(x)} > F_X(x-1)$. Therefore, all we need to do is show that

$$F_X(x) - \lfloor \bar{T}_X(x) \rfloor_{l(x)} > \frac{1}{2^{l(x)}}$$

This is true because

$$F_X(x) - \lfloor \bar{T}_X(x) \rfloor_{l(x)} > F_X(x) - \bar{T}_X(x) = \frac{P(x)}{2} \geq \frac{1}{2^{l(x)}}$$
Thus, this code is prefix free; by taking the binary representation of $\bar{T}_X(x)$ and truncating it to $l(x) = \left\lceil \log \frac{1}{P(x)} \right\rceil + 1$ bits, we obtain a uniquely decodable code.
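As a sanity check, the tag-and-truncate construction can be sketched numerically. The three-symbol source below is an illustrative assumption, not an example from the text; the code builds each codeword by truncating the binary expansion of the midpoint tag $\bar{T}_X(x) = F_X(x-1) + P(x)/2$ to $l(x)$ bits and then verifies that no codeword is a prefix of another.

```python
import math
from fractions import Fraction

# Hypothetical three-symbol source; the probabilities are assumptions.
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

codes = []
F = Fraction(0)  # cumulative distribution F_X(x - 1)
for p in probs:
    tag = F + p / 2                      # midpoint tag T-bar_X(x)
    l = math.ceil(math.log2(1 / p)) + 1  # l(x) = ceil(log 1/P(x)) + 1
    # Truncate the binary expansion of the tag to l bits.
    code = ""
    t = tag
    for _ in range(l):
        t *= 2
        bit = int(t)  # next binary digit of the tag
        code += str(bit)
        t -= bit
    codes.append(code)
    F += p  # advance to F_X(x)

# No codeword is a prefix of another, so the code is prefix free.
for i, a in enumerate(codes):
    for j, b in enumerate(codes):
        assert i == j or not b.startswith(a)

print(codes)  # → ['01', '101', '111']
```

Note that the exact `Fraction` arithmetic keeps the truncation loop free of floating-point rounding issues.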
Although the code is uniquely decodable, how efficient is it? We have shown that the number of bits $l(x)$ required to represent $F_X(x)$ with enough accuracy such that the code for different values of $x$ is distinct is

$$l(x) = \left\lceil \log \frac{1}{P(x)} \right\rceil + 1$$
Remember that $l(x)$ is the number of bits required to encode the entire sequence $x$. So, the average length of an arithmetic code for a sequence of length $m$ is given by

$$l_A^{(m)} = \sum P(x)\, l(x) \qquad (14)$$

$$= \sum P(x) \left[ \left\lceil \log \frac{1}{P(x)} \right\rceil + 1 \right] \qquad (15)$$

$$< \sum P(x) \left[ \log \frac{1}{P(x)} + 1 + 1 \right] \qquad (16)$$

$$= -\sum P(x) \log P(x) + 2 \sum P(x) \qquad (17)$$

$$= H(X^{(m)}) + 2 \qquad (18)$$
Given that the average length is always greater than the entropy, the bounds on $l_A^{(m)}$ are

$$H(X^{(m)}) \leq l_A^{(m)} < H(X^{(m)}) + 2.$$
The length per symbol, $l_A$, or rate of the arithmetic code is $\frac{l_A^{(m)}}{m}$. Therefore, the bounds on $l_A$ are

$$\frac{H(X^{(m)})}{m} \leq l_A < \frac{H(X^{(m)})}{m} + \frac{2}{m} \qquad (19)$$
We have shown in Chapter 3 that for iid sources

$$H(X^{(m)}) = mH(X) \qquad (20)$$
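Equation (20) can be checked numerically for a small iid source. The two-symbol alphabet and its probabilities below are illustrative assumptions; the sketch computes the joint entropy of the $m$-fold extension by brute force and compares it with $mH(X)$.

```python
import math
from itertools import product

# Hypothetical two-symbol iid source; the probabilities are assumptions.
p = {"a": 0.7, "b": 0.3}
H = -sum(q * math.log2(q) for q in p.values())  # per-symbol entropy H(X)

m = 3
Hm = 0.0  # joint entropy H(X^(m)) of the m-fold extension
for seq in product(p, repeat=m):
    P = math.prod(p[s] for s in seq)  # P(x) for the length-m sequence x
    Hm -= P * math.log2(P)

assert abs(Hm - m * H) < 1e-9  # H(X^(m)) = mH(X) for iid sources
```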
Therefore,

$$H(X) \leq l_A < H(X) + \frac{2}{m} \qquad (21)$$
By increasing the length of the sequence, we can guarantee a rate as close to the entropy as we
desire.
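This convergence of the rate toward the entropy can be illustrated with a short sketch. The source below is an assumption for demonstration purposes; for each sequence length $m$ it computes $l_A^{(m)} = \sum P(x)\, l(x)$ exactly by enumerating all length-$m$ sequences, then checks the bounds in (19).

```python
import math
from itertools import product

# Illustrative iid source; the probabilities are assumptions.
p = {"a": 0.7, "b": 0.3}
H = -sum(q * math.log2(q) for q in p.values())  # entropy H(X)

for m in (1, 2, 4, 8):
    lA_m = 0.0
    for seq in product(p, repeat=m):
        P = math.prod(p[s] for s in seq)               # P(x)
        lA_m += P * (math.ceil(math.log2(1 / P)) + 1)  # P(x) * l(x)
    rate = lA_m / m               # rate l_A = l_A^(m) / m
    assert H <= rate < H + 2 / m  # bounds (19)
    print(f"m={m}: rate={rate:.4f}, entropy={H:.4f}")
```

As $m$ grows, the slack term $2/m$ shrinks and the printed rate approaches the entropy, exactly as the bound in (21) predicts.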