Fig. 6.9 A Huffman code tree
Shannon's first theorem shows that entropy is a lower bound on code optimality. In fact, if $H$ is the entropy of an information source and $L_C$ is the mean codeword length of a code $C$ with respect to this information source, then:

$$L_C \geq H.$$
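To make the bound concrete, here is a minimal Python sketch; the four-symbol source and its probabilities are illustrative assumptions, not the source behind Fig. 6.9. It builds a Huffman code with a min-heap and checks that the mean codeword length $L_C$ stays above the entropy $H$.

import heapq
import math

def huffman_code(freqs):
    """Return a prefix code {symbol: bitstring} for a probability map."""
    # Heap entries: (weight, unique tie-breaker, {symbol: partial codeword}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)   # two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        # Prefix one subtree's codewords with 0, the other's with 1.
        merged = {s: "0" + bits for s, bits in c1.items()}
        merged.update({s: "1" + bits for s, bits in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}        # assumed source
code = huffman_code(probs)

H = -sum(p * math.log2(p) for p in probs.values())      # entropy H
L = sum(p * len(code[s]) for s, p in probs.items())     # mean length L_C

print(code)                            # e.g. {'a': '0', 'b': '10', ...}
print(f"H = {H:.3f}  L_C = {L:.3f}")   # about 1.846 and 1.900: L_C >= H

The tie-breaking integer only prevents the heap from comparing dictionaries; any Huffman tree for a given source yields the same mean codeword length.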
Let $T$ be a text, that is, a sequence $\alpha_1 \alpha_2 \ldots \alpha_m$ of codewords, and let us denote by $|\alpha|$ the length of a codeword $\alpha$. The digital size of $T$ (with respect to the code $C$ of its codewords) is given by:

$$\mathrm{digit}_C(T) = \sum_{j=1}^{m} |\alpha_j|.$$
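As a small illustration, $\mathrm{digit}_C(T)$ is just the sum of the lengths of the codewords spelling $T$; the prefix code below is a hypothetical example, not one from the text.

code = {"a": "0", "b": "10", "c": "110", "d": "111"}    # assumed code C

def digit(code, text):
    """digit_C(T): total number of binary digits spent on T's codewords."""
    return sum(len(code[symbol]) for symbol in text)

print(digit(code, "abacad"))    # 1+2+1+3+1+3 = 11 binary digits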
A compression of $T$ is a text $T'$ such that, for some code $C'$: i) $\mathrm{digit}_{C'}(T') < \mathrm{digit}_C(T)$, and ii) the text $T$ can be univocally recovered from $T'$. The compression ratio is $\mathrm{digit}_{C'}(T') / \mathrm{digit}_C(T)$. This ratio, together with the space and time complexity of a compression algorithm, are the basic elements for evaluating a compression method.
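A short sketch of the ratio under two assumed codes, a fixed-length code $C$ and a variable-length prefix code $C'$ (both hypothetical):

fixed_C   = {"a": "00", "b": "01", "c": "10", "d": "11"}    # code C
prefix_C1 = {"a": "0", "b": "10", "c": "110", "d": "111"}   # code C'

text = "aaabbcad"
size_C  = sum(len(fixed_C[s])   for s in text)   # digit_C(T)   = 16
size_C1 = sum(len(prefix_C1[s]) for s in text)   # digit_C'(T') = 14
print(size_C1 / size_C)                          # ratio 0.875 < 1

Since both codes are prefix codes, the text is univocally recoverable, so condition ii) holds and a ratio below 1 certifies a genuine compression.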
It is easy to realize that no universal compression algorithm can exist. In fact, no compression method can satisfy the two conditions above for every text. If such a universal method existed, then, by applying compression, every text of a certain digital size $q$ could be reduced to a text of digital size $q' < q$. But of course the number of texts of size $q'$ is less than the number of texts of size $q$; therefore it is impossible to recover, univocally, every $q$-text from some $q'$-text.
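A one-line count (assuming binary digits) makes the pigeonhole step explicit: even taking all texts of size strictly less than $q$ together,

$$\sum_{i=0}^{q-1} 2^i = 2^q - 1 < 2^q,$$

so at least two distinct $q$-texts would have to be compressed to the same shorter text, and univocal recovery fails.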