We seem to be striking out with continuous random variables. There is no analog for self-information and really none for entropy either. However, the situation improves when we look for an analog for the average mutual information. Let us define the random variable $Y_d$ in a manner similar to the random variable $X_d$, as the discretized version of a continuous valued random variable $Y$. Then we can show (see Problem 4 at the end of this chapter)
\[
H(X_d|Y_d) = -\sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty} f_{X|Y}(x_i|y_j)\, f_Y(y_j)\,\Delta\,\Delta\, \log f_{X|Y}(x_i|y_j) \;-\; \log\Delta \tag{45}
\]
Therefore, the average mutual information for the discretized random variables is given by
\[
I(X_d;Y_d) = H(X_d) - H(X_d|Y_d) \tag{46}
\]
\[
= -\sum_{i=-\infty}^{\infty} f_X(x_i)\,\Delta\, \log f_X(x_i) \;-\; \log\Delta \tag{47}
\]
\[
\;-\left[\, -\sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty} f_{X|Y}(x_i|y_j)\, f_Y(y_j)\,\Delta\,\Delta\, \log f_{X|Y}(x_i|y_j) \;-\; \log\Delta \right] \tag{48}
\]
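The role of the $-\log\Delta$ terms in (47) and (48) can be seen numerically: each discretized entropy behaves like $h(X) - \log\Delta$ and so grows without bound as $\Delta \to 0$, even though the difference in (46) stays finite. A minimal sketch, assuming $X \sim N(0,1)$ (a choice made here purely for illustration; its differential entropy is $\tfrac{1}{2}\log_2 2\pi e \approx 2.047$ bits):

```python
# Numerical sketch of why H(X_d) alone has no continuous limit: for a
# discretized density, H(X_d) behaves like h(X) - log2(Delta), so it grows
# without bound as Delta -> 0 even though the difference in (46) stays finite.
# Assumes X ~ N(0,1), an illustrative choice only.
import math

def H_discretized(delta, span=6.0):
    """H(X_d) = -sum_i f_X(x_i)*Delta * log2(f_X(x_i)*Delta), with x_i = i*Delta."""
    n = int(span / delta)
    h = 0.0
    for i in range(-n, n + 1):
        p = math.exp(-(i * delta) ** 2 / 2.0) / math.sqrt(2.0 * math.pi) * delta
        if p > 0.0:
            h -= p * math.log2(p)
    return h

h_X = 0.5 * math.log2(2.0 * math.pi * math.e)   # h(X) for N(0,1), ~2.047 bits
for delta in (0.5, 0.05, 0.005):
    print(delta, H_discretized(delta), h_X - math.log2(delta))
```

At each step the computed $H(X_d)$ tracks $h(X) - \log_2\Delta$ closely, so every halving of $\Delta$ adds about one bit: the discretized entropy has no finite limit, which is why we work with differences of entropies instead.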
Notice that the two $\log\Delta$ terms in the expressions for $H(X_d)$ and $H(X_d|Y_d)$ cancel each other out, and as long as $h(X)$ and $h(X|Y)$ are not equal to infinity, when we take the limit as $\Delta \to 0$ of $I(X_d;Y_d)$ we get

\[
I(X;Y) = h(X) - h(X|Y) \tag{49}
\]

The average mutual information in the continuous case can be obtained as a limiting case of the average mutual information for the discrete case and has the same physical significance. We have gone through a lot of mathematics in this section, but these results will be used immediately to define the rate distortion function for a random source.
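The limit in (49) can also be checked numerically. The sketch below assumes the jointly Gaussian pair $Y = X + N$ with $X \sim N(0,1)$ and $N \sim N(0,1)$ (an illustrative setup, not from the text, chosen because the closed form is $I(X;Y) = \tfrac{1}{2}\log_2(1 + \sigma_X^2/\sigma_N^2) = 0.5$ bit). It computes $I(X_d;Y_d)$ directly from the discretized probabilities; the $\Delta$'s cancel inside the log ratio, mirroring the cancellation of the $\log\Delta$ terms in (47) and (48).

```python
# Numerical check of (49): I(X;Y) = h(X) - h(X|Y) as the limit of the
# discretized I(X_d;Y_d).  Illustrative setup: Y = X + N with X ~ N(0,1)
# and N ~ N(0,1), for which I(X;Y) = 0.5 bit in closed form.
import math

def gauss(x, var):
    """Zero-mean Gaussian density with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def discretized_mi(delta, span=6.0):
    """I(X_d;Y_d) = sum_ij P(x_i,y_j) log2[ P(x_i,y_j) / (P(x_i)P(y_j)) ].

    P(x_i) ~ f_X(x_i)*Delta and P(x_i,y_j) ~ f_XY(x_i,y_j)*Delta^2, so the
    Delta's cancel inside the log ratio and the sum has a finite limit."""
    n = int(span / delta)
    grid = [k * delta for k in range(-n, n + 1)]
    mi = 0.0
    for x in grid:
        px = gauss(x, 1.0) * delta               # P(X_d = x_i)
        for y in grid:
            py = gauss(y, 2.0) * delta           # Y = X + N has variance 2
            pxy = gauss(x, 1.0) * gauss(y - x, 1.0) * delta * delta
            if pxy > 0.0:
                mi += pxy * math.log2(pxy / (px * py))
    return mi

print(discretized_mi(0.1))   # close to the analytic value 0.5
```

Shrinking $\Delta$ keeps the result pinned near 0.5 bit, in contrast to the discretized entropies themselves, which diverge.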
8.5 Rate Distortion Theory
Rate distortion theory is concerned with the trade-offs between distortion and rate in lossy
compression schemes. Rate is defined as the average number of bits used to represent each
sample value. One way of representing the trade-offs is via a rate distortion function $R(D)$. The rate distortion function $R(D)$ specifies the lowest rate at which the output of a source can be encoded while keeping the distortion less than or equal to $D$. The distortion $D$ is generally taken to be the expected value of a single-letter distortion measure $d(x_i, y_j)$, which is a measure of the difference between $x_i$ and $y_j$:
\[
D = E[d(X,Y)] = \sum_{i=0}^{N-1}\sum_{j=0}^{M-1} d(x_i, y_j)\, P(x_i, y_j)
\]
On our way to mathematically defining the rate distortion function, let us look at the rate and
distortion for some different lossy compression schemes.
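As a first concrete data point, here is a toy sketch of rate and distortion for one such scheme. All specifics (an 8-level uniform source, a 1-bit quantizer, squared-error distortion) are choices made here for illustration, not taken from the text: the source emits $x \in \{0,\dots,7\}$ equiprobably (3 bits/sample uncompressed), and keeping only the most significant bit gives a rate of 1 bit/sample.

```python
# Toy rate-distortion data point.  All specifics (8-level uniform source,
# 1-bit quantizer, squared-error measure) are illustrative choices.
# Source: x in {0,...,7}, equiprobable -> 3 bits/sample uncompressed.
# Scheme: keep only the most significant bit -> rate R = 1 bit/sample.

def d(x, y):
    """Single-letter squared-error distortion measure d(x, y)."""
    return (x - y) ** 2

def Q(x):
    """1-bit quantizer: reconstruct with the midpoint of each half-range."""
    return 1.5 if x < 4 else 5.5

# The quantizer is deterministic, so P(x_i, y_j) = P(x_i) when y_j = Q(x_i)
# and 0 otherwise; the double sum for D collapses to a single sum.
P_x = 1.0 / 8.0
D = sum(d(x, Q(x)) * P_x for x in range(8))
print(D)   # 1.25
```

Spending more bits buys less distortion: a 2-bit quantizer with reconstruction points 0.5, 2.5, 4.5, 6.5 drives $D$ down to 0.25 by the same computation. That trade-off between $R$ and $D$ is exactly what $R(D)$ formalizes.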