Fig. 6.9 A Huffman code tree
Shannon's first theorem shows that entropy is a lower bound on code optimality. In fact, if $H$ is the entropy of an information source and $L_C$ is the mean codeword length of a code $C$ with respect to this information source, then:

$$L_C \geq H.$$
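To make the bound concrete, here is a minimal Python sketch; the four-symbol source and its probabilities are illustrative assumptions, not the source behind Fig. 6.9. It builds a Huffman code with a min-heap and checks that the mean codeword length $L_C$ stays above the entropy $H$.

import heapq
import math

def huffman_code(freqs):
    """Return a prefix code {symbol: bitstring} for a probability map."""
    # Heap entries: (weight, unique tie-breaker, {symbol: partial codeword}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)   # two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        # Prefix one subtree's codewords with 0, the other's with 1.
        merged = {s: "0" + bits for s, bits in c1.items()}
        merged.update({s: "1" + bits for s, bits in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}        # assumed source
code = huffman_code(probs)

H = -sum(p * math.log2(p) for p in probs.values())      # entropy H
L = sum(p * len(code[s]) for s, p in probs.items())     # mean length L_C

print(code)                            # e.g. {'a': '0', 'b': '10', ...}
print(f"H = {H:.3f}  L_C = {L:.3f}")   # about 1.846 and 1.900: L_C >= H

The tie-breaking integer only prevents the heap from comparing dictionaries; any Huffman tree for a given source yields the same mean codeword length.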
Let $T$ be a text, that is, a sequence $\alpha_1 \alpha_2 \ldots \alpha_m$ of codewords, and let us denote by $|\alpha|$ the length of a codeword $\alpha$. The digital size of $T$ (with respect to the code $C$ of its codewords) is given by:

$$\mathrm{digit}_C(T) = \sum_{j=1}^{m} |\alpha_j|.$$
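As a small illustration, $\mathrm{digit}_C(T)$ is just the sum of the lengths of the codewords spelling $T$; the prefix code below is a hypothetical example, not one from the text.

code = {"a": "0", "b": "10", "c": "110", "d": "111"}    # assumed code C

def digit(code, text):
    """digit_C(T): total number of binary digits spent on T's codewords."""
    return sum(len(code[symbol]) for symbol in text)

print(digit(code, "abacad"))    # 1+2+1+3+1+3 = 11 binary digits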
A compression of $T$ is a text $T'$ such that, for some code $C'$: i) $\mathrm{digit}_{C'}(T') < \mathrm{digit}_C(T)$, and ii) the text $T$ can be univocally recovered from $T'$. The compression ratio is $\mathrm{digit}_{C'}(T') / \mathrm{digit}_C(T)$. This ratio, together with the space and time complexity of a compression algorithm, are the basic elements for evaluating a compression method.
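A short sketch of the ratio under two assumed codes, a fixed-length code $C$ and a variable-length prefix code $C'$ (both hypothetical):

fixed_C   = {"a": "00", "b": "01", "c": "10", "d": "11"}    # code C
prefix_C1 = {"a": "0", "b": "10", "c": "110", "d": "111"}   # code C'

text = "aaabbcad"
size_C  = sum(len(fixed_C[s])   for s in text)   # digit_C(T)   = 16
size_C1 = sum(len(prefix_C1[s]) for s in text)   # digit_C'(T') = 14
print(size_C1 / size_C)                          # ratio 0.875 < 1

Since both codes are prefix codes, the text is univocally recoverable, so condition ii) holds and a ratio below 1 certifies a genuine compression.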
It is easy to realize that no universal compression algorithm can exist. In fact, no compression method can satisfy the two conditions above for every text. If such a universal method existed, then, by applying compression, every text of a certain digital size $q$ could be reduced to a text of digital size $q' < q$. But of course the number of texts of size $q'$ is less than the number of texts of size $q$; therefore it is impossible to recover, univocally, every $q$-text from some $q'$-text.
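A one-line count (assuming binary digits) makes the pigeonhole step explicit: even taking all texts of size strictly less than $q$ together,

$$\sum_{i=0}^{q-1} 2^i = 2^q - 1 < 2^q,$$

so at least two distinct $q$-texts would have to be compressed to the same shorter text, and univocal recovery fails.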