then the joint entropy of two random variables $X$, $Y$ is given by:
$$H(X,Y) = -\sum_{X=x,\,Y=y} p(x,y)\,\log p(x,y)$$
The entropy of $X$ conditioned on $Y$ is given by the following equation, where $p(x \mid y)$ is the conditional probability of having $X = x$ when $Y = y$:
$$H(X \mid Y) = -\sum_{X=x,\,Y=y} p(x,y)\,\log p(x \mid y)$$
It can be shown that:
$$H(X \mid Y) = \sum_{Y=y} p(y)\,H(X \mid Y=y)$$
and $H(X,Y) = H(X) + H(Y \mid X)$.
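To make these definitions concrete, here is a minimal Python sketch (not from the original text; the joint distribution is an arbitrary example of mine) that computes the entropies from a small table $p(x,y)$ and checks numerically both $H(X,Y) = H(X) + H(Y \mid X)$ and $H(X \mid Y) = \sum_{Y=y} p(y)\,H(X \mid Y=y)$.

```python
from math import log2

# Arbitrary joint distribution p(x, y), chosen only for illustration.
p_xy = {
    (0, 0): 0.4, (0, 1): 0.1,
    (1, 0): 0.2, (1, 1): 0.3,
}

def H(dist):
    """Shannon entropy (in bits) of a distribution {value: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(p_xy, axis):
    """Marginal over X (axis=0) or Y (axis=1)."""
    m = {}
    for pair, p in p_xy.items():
        m[pair[axis]] = m.get(pair[axis], 0.0) + p
    return m

def H_cond(p_xy, given_axis):
    """Conditional entropy: given_axis=1 gives H(X|Y), given_axis=0 gives H(Y|X)."""
    m = marginal(p_xy, given_axis)
    return -sum(p * log2(p / m[pair[given_axis]])
                for pair, p in p_xy.items() if p > 0)

p_x, p_y = marginal(p_xy, 0), marginal(p_xy, 1)

# H(X, Y) = H(X) + H(Y | X)
print(H(p_xy), "=", H(p_x) + H_cond(p_xy, given_axis=0))

# H(X | Y) = sum_y p(y) H(X | Y = y)
avg = sum(p_y[y] * H({x: p_xy[(x, y)] / p_y[y] for x in p_x}) for y in p_y)
print(H_cond(p_xy, given_axis=1), "=", avg)
```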
Two other important and related notions of information theory are the mutual information $I$ between $X$ and $Y$ and the entropic divergence between them. The mutual information (usually defined by means of the entropic divergence) satisfies the following important equation:
$$I(X \mid Y) = H(X) - H(X \mid Y)$$
The equation above explains the name used for denoting $I$. In fact, let us identify the entropy of an information source with the average (probabilistic) information of the data that it emits. If we consider $X$ and $Y$ as two processes, where the first is an information source sending data through a channel and the second is the information source of the data as they are received, then $I(X \mid Y)$ is a (probabilistic) measure of the information passing from the sender to the receiver: the average information of $X$ (the transmitted information) minus the average information of $X$ when $Y$ (the received information) is given.
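As a concrete illustration of this sender/receiver reading, here is a small Python sketch; the channel model (a binary symmetric channel) and its parameters are my own choice for the example, not taken from the text. It builds the joint distribution of the transmitted bit $X$ and the received bit $Y$, then evaluates $I(X \mid Y) = H(X) - H(X \mid Y)$: with no noise a full bit per symbol passes through, while a channel that flips bits half of the time lets no information through.

```python
from math import log2

def entropy(dist):
    """Shannon entropy (bits) of {value: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def mutual_information_bsc(p1=0.5, eps=0.1):
    """I(X|Y) = H(X) - H(X|Y) for a binary symmetric channel.

    p1  : probability that the source X emits 1 (example value)
    eps : probability that the channel flips the transmitted bit
    """
    p_x = {0: 1 - p1, 1: p1}
    # Joint distribution p(x, y): the bit is flipped with probability eps.
    p_xy = {(x, y): p_x[x] * (eps if x != y else 1 - eps)
            for x in (0, 1) for y in (0, 1)}
    p_y = {y: sum(p_xy[(x, y)] for x in (0, 1)) for y in (0, 1)}
    # H(X | Y) = -sum p(x, y) log p(x | y)
    h_x_given_y = -sum(p * log2(p / p_y[y])
                       for (x, y), p in p_xy.items() if p > 0)
    return entropy(p_x) - h_x_given_y

for eps in (0.0, 0.1, 0.5):
    print(f"crossover {eps:.1f}: I = {mutual_information_bsc(eps=eps):.3f} bits")
```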
The entropic concepts can be easily extended from discrete to continuous variables. This extension is necessary for the analysis of transmission by means of signals, which are functions, periodic in time, realized by electromagnetic waves. In this case, data are encoded by altering a wave in a suitable way (modulating it), and when the wave is received its alterations are decoded (demodulated) for recovering the encoded messages. This method mixes discrete methods for representing information with continuous signals playing the role of channels. Shannon proves the sampling theorem, extending an already known result, according to which the capacity of a periodic function is related to its maximum frequency (in its composition as a sum of circular functions, in its Fourier representation). By using the sampling theorem, Shannon proves his celebrated third theorem, which gives the capacity of a continuous signal affected by a noise of power $N$. This capacity is given by the maximum frequency of the signal multiplied by a factor which is the logarithm of the ratio $(P + N)/N$, where $P$ is the power of the signal.
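As a hedged numeric sketch of this last formula (the bandwidth, signal power, and noise power below are example values of mine, not from the text), the capacity in bits per second is the maximum frequency $W$ multiplied by $\log_2\big((P + N)/N\big)$:

```python
from math import log2

def channel_capacity(bandwidth_hz, signal_power, noise_power):
    """Capacity (bits per second) of a noisy continuous channel.

    C = W * log2((P + N) / N), with W the bandwidth (maximum frequency),
    P the signal power and N the noise power.
    """
    return bandwidth_hz * log2((signal_power + noise_power) / noise_power)

# Example values chosen only for illustration: a 3 kHz channel
# (roughly a telephone line) with a signal-to-noise ratio of 1000 (30 dB).
print(channel_capacity(3000, signal_power=1000, noise_power=1))
# ~29,900 bits per second
```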