may also be viewed as the average number of bits needed for recording output from L. The rate of L for messages of length n is defined by

r_n(L) = H(L)/n,

and the rate of L is defined to be

r(L) = lim_{n→∞} r_n(L).

This is the average number of bits of entropy per letter.
The absolute rate of L with a k-letter alphabet is given by

R(L) = log_2(k),

which is the maximum number of bits per letter in a string from L. Thus, the redundancy of L is defined by

D(L) = R(L) − r(L),

and the redundancy rate is D(L)/R(L).
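As a concrete illustration of these definitions, the sketch below estimates the rate, absolute rate, and redundancy of a text sample. Note the caveat: a single-letter (unigram) frequency count only approximates r(L) from above, since it ignores dependencies between letters; Shannon's much lower figure for English comes from accounting for longer-range structure. The function names and the sample string are mine, for illustration only.

```python
import math
from collections import Counter

def unigram_entropy(text):
    """First-order (single-letter) entropy estimate, in bits per letter.
    This overestimates the true rate r(L) because it ignores
    inter-letter dependencies."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def redundancy(text, alphabet_size=26):
    """Return (r, R, D): the estimated rate r, the absolute rate
    R = log_2(k), and the redundancy D = R - r, all in bits/letter."""
    r = unigram_entropy(text)
    R = math.log2(alphabet_size)
    return r, R, R - r

sample = "thequickbrownfoxjumpsoverthelazydog"
r, R, D = redundancy(sample)
print(f"rate ~ {r:.2f}, absolute rate ~ {R:.2f}, redundancy ~ {D:.2f} bits/letter")
```

On a realistic corpus this unigram estimate gives r around 4.1 bits per letter; only models of longer contexts push the estimate down toward Shannon's 1 to 1.5 bits per letter.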
Redundancy in languages supplies an important cryptanalytic tool for recovering plaintext or keys from ciphertext.
Example 11.8
Consider L to be English. Shannon was able to demonstrate that, for English,

1 ≤ r(L) ≤ 1.5,

which means that the average information content of the English language is between 1 and 1.5 bits per letter. Moreover, since

R(L) = log_2(26) ≈ 4.7

bits per letter, then
D(L) = R(L) − r(L) ≈ 4.7 − 1.25 ≈ 3.5,

so the redundancy of English is roughly 3.5 bits per letter. Looking at this another way, the redundancy rate is roughly

D(L)/R(L) ≈ 3.5/4.7 ≈ 75%.
In other words, 3/4 of the English language is redundant. This does not mean that you can discard three quarters of a given English text and still be able to read it. It does mean that there is a Huffman encoding of length-n English text that will, for sufficiently large values of n, compress it to about a quarter of what it was (see inequality (11.7)).
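Since the text appeals to Huffman encoding, here is a minimal sketch of building a Huffman code from letter frequencies and comparing its cost against a fixed-width code. This is a toy single-letter code, so it can only approach the unigram entropy, not Shannon's 1 to 1.5 bits per letter; reaching the true rate requires coding over long blocks. The function names and sample string are assumptions of this sketch, not from the text.

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Build a Huffman code (symbol -> bitstring) from a frequency map
    by repeatedly merging the two least-frequent subtrees."""
    # Each heap entry: (total frequency, unique tiebreaker, partial codebook).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Prepend a branch bit to every codeword in each merged subtree.
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

text = "redundancyinlanguagessuppliesanimportantcryptanalytictool"
code = huffman_code(Counter(text))
encoded_bits = sum(len(code[ch]) for ch in text)
fixed_bits = 5 * len(text)  # a fixed 5-bit code covers a 26-letter alphabet
print(encoded_bits, "Huffman bits vs", fixed_bits, "fixed-width bits")
```

Because Huffman coding is optimal among prefix codes, the Huffman total can never exceed the fixed-width total; the gap between them is one crude, visible measure of the redundancy discussed above.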