may also be viewed as the average number of bits needed for recording output from L. The rate of L for messages of length n is defined by

r_n(L) = H(L)/n,

and the rate of L is defined to be

r(L) = lim_{n→∞} r_n(L).

This is the average number of bits of entropy per letter.
The absolute rate of L with a k-letter alphabet is given by

R(L) = log_2(k),

which is the maximum number of bits per letter in a string from L. Thus, the redundancy of L is defined by

D(L) = R(L) − r(L),

and the redundancy rate is D(L)/R(L).
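As a concrete illustration of these definitions, the following sketch computes a first-order estimate of the rate, the absolute rate, and the redundancy for a toy four-letter language. The letter frequencies are hypothetical, chosen only to make the arithmetic clean; a real estimate of r(L) would require letter (and longer-block) statistics from an actual corpus.

```python
from math import log2

# Illustrative single-letter frequency distribution for a toy
# four-letter language (these probabilities are assumptions, not data).
freqs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

def entropy(dist):
    """Shannon entropy in bits: H = -sum p * log2(p)."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

k = len(freqs)       # alphabet size
r1 = entropy(freqs)  # first-order estimate of the rate, r_1(L) = H(L)/1
R = log2(k)          # absolute rate R(L) = log2(k)
D = R - r1           # redundancy D(L) = R(L) - r(L)

print(r1, R, D, D / R)  # → 1.75 2.0 0.25 0.125
```

Here the language carries 1.75 bits of entropy per letter against a maximum of 2, so one eighth of each letter is redundant. Note that r_1 only uses single-letter statistics; the true rate r(L) is the limit over ever-longer messages, which for natural languages is strictly smaller because of correlations between letters.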
Redundancy in languages supplies an important cryptanalytic tool in terms of
recovering plaintext or keys from ciphertext.
Example 11.8 Consider L to be English. Shannon was able to demonstrate that, for English,

1 ≤ r(L) ≤ 1.5,

which means that the average information content of the English language is between 1 and 1.5 bits per letter. Moreover, since

R(L) = log_2(26) ≈ 4.7

bits per letter, then (taking r(L) ≈ 1.25, the midpoint of the above bounds)

D(L) = R(L) − r(L) ≈ 4.7 − 1.25 ≈ 3.5,

so the redundancy of English is roughly 3.5 bits per letter. Looking at this another way, the redundancy rate is roughly

D(L)/R(L) ≈ 3.5/4.7 ≈ 75%.

In other words, 3/4 of the English language is redundant. This does not mean that you can discard three quarters of a given English text and still be able to read it. It does mean that there is a Huffman encoding of length-n English text that will, for sufficiently large values of n, compress it to about a quarter of what it was (see inequality (11.7)).
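The Huffman argument can be sketched numerically. The code below builds a Huffman code over single letters, using approximate English letter frequencies from standard published tables (the exact percentages here are assumptions), and compares its average codeword length to the 4.7 bits per letter of a fixed-length encoding.

```python
import heapq
from math import log2

# Approximate English letter frequencies (percent); values are taken as
# illustrative assumptions, not exact corpus measurements.
freq = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7,
    "s": 6.3, "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8,
    "u": 2.8, "m": 2.4, "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0,
    "p": 1.9, "b": 1.5, "v": 1.0, "k": 0.8, "j": 0.15, "x": 0.15,
    "q": 0.1, "z": 0.07,
}

def huffman_lengths(weights):
    """Return {symbol: codeword length} for a Huffman code on weights."""
    # Heap entries: (total weight, tiebreaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every codeword inside them.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

total = sum(freq.values())
probs = {s: w / total for s, w in freq.items()}
lengths = huffman_lengths(freq)
avg_len = sum(probs[s] * lengths[s] for s in probs)

print(f"fixed-length encoding: {log2(26):.2f} bits/letter")
print(f"Huffman (per letter):  {avg_len:.2f} bits/letter")
```

The per-letter Huffman code lands near the single-letter entropy of English, roughly 4.2 bits per letter, well short of the 75% compression claimed above. That is consistent with the example's statement: the full redundancy only emerges when Huffman coding is applied to long blocks of text, where correlations between letters pull the rate down toward r(L).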