over the noisy channel. In the example given earlier, we have seen a very primitive
form of coding (i.e., the source bit 0 is assigned a sequence of zeros, whereas
the source bit 1 is assigned a sequence of ones). In either case, the code words
are transmitted over the noisy channel and received by a decoder, which attempts
to determine the original source signals. In general, to achieve reliability without
sacrificing speed of transmission in digital communications, code words must not
be assigned to single bits or bytes but instead to longer bit blocks. In other words,
the encoder waits for the source to generate a block of bits of a specified length
and then assigns a code word to the entire block. The decoder, in turn, examines
the received sequence and makes a decision as to the identity of the original source
bits. In practice, encoding and decoding are much more involved than this simple
example may suggest.
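To make the repetition idea from the example concrete, here is a minimal sketch (the block length N = 5, the flip probability, and all function names are illustrative choices, not taken from the text): each source bit is expanded into a block of identical bits, and the decoder recovers it by majority vote over each received block.

```python
import random

N = 5  # repetition factor (illustrative choice)

def encode(bits):
    """Map each source bit to a block of N identical bits."""
    return [b for bit in bits for b in [bit] * N]

def noisy_channel(bits, flip_prob=0.1):
    """Flip each transmitted bit independently with probability flip_prob."""
    return [bit ^ (random.random() < flip_prob) for bit in bits]

def decode(received):
    """Decide each source bit by majority vote over its block of N bits."""
    blocks = [received[i:i + N] for i in range(0, len(received), N)]
    return [int(sum(block) > N // 2) for block in blocks]

source = [0, 1, 1, 0]
print(decode(noisy_channel(encode(source))))  # usually recovers [0, 1, 1, 0]
```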
In order to make these ideas more concrete, we need a mathematical measure
for the information conveyed by a message, or—more generally—a measure of
information. This is where the notion of entropy comes into play.
5.2 ENTROPY
Let (Ω, Pr) be a discrete probability space (or random experiment, respectively)
and X : Ω → 𝒳 a random variable with range 𝒳 = {1, 2, ..., 5} and uniformly
distributed elements. If we have no prior knowledge about X and try to guess the
correct value of X, then we have a probability of 1/|𝒳| = 1/5 of being correct. If,
however, we have some prior knowledge and know, for example, that 1 ≤ X ≤ 2,
then we have a higher probability of correctly guessing X (i.e., 1/2 in this example).
In other words, there is less uncertainty about the second situation, and knowing that
1 ≤ X ≤ 2 has in fact reduced the uncertainty about the value of X. It thus appears
that if we could pin down the notion of uncertainty, we would also be able to measure
precisely the transfer of information.
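A small simulation makes the effect of prior knowledge visible (the range {1, ..., 5} and the restriction 1 ≤ X ≤ 2 are taken from the example above; the trial count and variable names are illustrative): guessing uniformly over the whole range succeeds with probability close to 1/5, while guessing within the restricted set succeeds with probability close to 1/2.

```python
import random

RANGE = [1, 2, 3, 4, 5]   # range of X from the example
TRIALS = 200_000

no_prior = with_prior = conditioned = 0
for _ in range(TRIALS):
    x = random.choice(RANGE)
    # No prior knowledge: guess uniformly over the whole range.
    no_prior += (random.choice(RANGE) == x)
    # Prior knowledge "1 <= X <= 2": count only trials where it holds,
    # and guess uniformly over the reduced set {1, 2}.
    if x <= 2:
        conditioned += 1
        with_prior += (random.choice([1, 2]) == x)

print(no_prior / TRIALS)         # close to 1/5
print(with_prior / conditioned)  # close to 1/2
```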
Suppose that a discrete random experiment involves the observation of a
random variable X, and let X take on a finite number of possible values x_1, ..., x_n.
The probability that X takes on x_i (i = 1, ..., n) is Pr[X = x_i] = P_X(x_i) and
is abbreviated as p_i (note that p_i ≥ 0 for all i and p_1 + ... + p_n = 1). Our goal is to come
up with a measure for the uncertainty associated with X. To achieve this goal, we
construct the following two functions:
First, we define the function h on the interval [0, 1]. The value h(p) can
be interpreted as the uncertainty associated with an event that occurs with
probability p. If the event (X = x_i) has probability p_i, then we say that
h(p_i) is the uncertainty associated with the event (X = x_i), or the uncertainty
removed (or information conveyed) by revealing that X has taken on value x_i.
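The text defines h here only by its intended interpretation and pins down the concrete function later. As a hedged preview, the standard choice satisfying this interpretation is h(p) = -log2(p), measured in bits; the sketch below assumes that choice, and the function name and example values are illustrative.

```python
import math

def h(p: float) -> float:
    """Uncertainty (in bits) of an event of probability p, using the
    standard choice h(p) = -log2(p); h is undefined at p = 0."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p) if p < 1.0 else 0.0

# A rare event carries more uncertainty (conveys more information) than a likely one.
print(h(1/2))  # 1.0 bit
print(h(1/5))  # about 2.32 bits
print(h(1.0))  # 0.0 bits: a certain event removes no uncertainty
```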