We seem to be striking out with continuous random variables. There is no analog for self-information and really none for entropy either. However, the situation improves when we look for an analog for the average mutual information. Let us define the random variable $Y_d$ in a manner similar to the random variable $X_d$, as the discretized version of a continuous valued random variable $Y$. Then we can show (see Problem 4 at the end of this chapter)
\[
H(X_d|Y_d) = -\sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty} f_{X|Y}(x_i|y_j)\, f_Y(y_j)\,\Delta\,\Delta\, \log f_{X|Y}(x_i|y_j) \;-\; \log\Delta \tag{45}
\]
Therefore, the average mutual information for the discretized random variables is given by
\[
I(X_d;Y_d) = H(X_d) - H(X_d|Y_d) \tag{46}
\]
\[
= -\sum_{i=-\infty}^{\infty} f_X(x_i)\,\Delta\, \log f_X(x_i) \;-\; \log\Delta \tag{47}
\]
\[
\;-\left[\, -\sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty} f_{X|Y}(x_i|y_j)\, f_Y(y_j)\,\Delta\,\Delta\, \log f_{X|Y}(x_i|y_j) \;-\; \log\Delta \right] \tag{48}
\]
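The role of the $-\log\Delta$ terms in (47) and (48) can be seen numerically: each discretized entropy behaves like $h(X) - \log\Delta$ and so grows without bound as $\Delta \to 0$, even though the difference in (46) stays finite. A minimal sketch, assuming $X \sim N(0,1)$ (a choice made here purely for illustration; its differential entropy is $\tfrac{1}{2}\log_2 2\pi e \approx 2.047$ bits):

```python
# Numerical sketch of why H(X_d) alone has no continuous limit: for a
# discretized density, H(X_d) behaves like h(X) - log2(Delta), so it grows
# without bound as Delta -> 0 even though the difference in (46) stays finite.
# Assumes X ~ N(0,1), an illustrative choice only.
import math

def H_discretized(delta, span=6.0):
    """H(X_d) = -sum_i f_X(x_i)*Delta * log2(f_X(x_i)*Delta), with x_i = i*Delta."""
    n = int(span / delta)
    h = 0.0
    for i in range(-n, n + 1):
        p = math.exp(-(i * delta) ** 2 / 2.0) / math.sqrt(2.0 * math.pi) * delta
        if p > 0.0:
            h -= p * math.log2(p)
    return h

h_X = 0.5 * math.log2(2.0 * math.pi * math.e)   # h(X) for N(0,1), ~2.047 bits
for delta in (0.5, 0.05, 0.005):
    print(delta, H_discretized(delta), h_X - math.log2(delta))
```

At each step the computed $H(X_d)$ tracks $h(X) - \log_2\Delta$ closely, so every halving of $\Delta$ adds about one bit: the discretized entropy has no finite limit, which is why we work with differences of entropies instead.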
Notice that the two $\log\Delta$ terms in the expressions for $H(X_d)$ and $H(X_d|Y_d)$ cancel each other out, and as long as $h(X)$ and $h(X|Y)$ are not equal to infinity, when we take the limit as $\Delta \to 0$ of $I(X_d;Y_d)$ we get

\[
I(X;Y) = h(X) - h(X|Y) \tag{49}
\]

The average mutual information in the continuous case can be obtained as a limiting case of the average mutual information for the discrete case and has the same physical significance. We have gone through a lot of mathematics in this section, but these results will be used immediately to define the rate distortion function for a random source.
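The limit in (49) can also be checked numerically. The sketch below assumes the jointly Gaussian pair $Y = X + N$ with $X \sim N(0,1)$ and $N \sim N(0,1)$ (an illustrative setup, not from the text, chosen because the closed form is $I(X;Y) = \tfrac{1}{2}\log_2(1 + \sigma_X^2/\sigma_N^2) = 0.5$ bit). It computes $I(X_d;Y_d)$ directly from the discretized probabilities; the $\Delta$'s cancel inside the log ratio, mirroring the cancellation of the $\log\Delta$ terms in (47) and (48).

```python
# Numerical check of (49): I(X;Y) = h(X) - h(X|Y) as the limit of the
# discretized I(X_d;Y_d).  Illustrative setup: Y = X + N with X ~ N(0,1)
# and N ~ N(0,1), for which I(X;Y) = 0.5 bit in closed form.
import math

def gauss(x, var):
    """Zero-mean Gaussian density with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def discretized_mi(delta, span=6.0):
    """I(X_d;Y_d) = sum_ij P(x_i,y_j) log2[ P(x_i,y_j) / (P(x_i)P(y_j)) ].

    P(x_i) ~ f_X(x_i)*Delta and P(x_i,y_j) ~ f_XY(x_i,y_j)*Delta^2, so the
    Delta's cancel inside the log ratio and the sum has a finite limit."""
    n = int(span / delta)
    grid = [k * delta for k in range(-n, n + 1)]
    mi = 0.0
    for x in grid:
        px = gauss(x, 1.0) * delta               # P(X_d = x_i)
        for y in grid:
            py = gauss(y, 2.0) * delta           # Y = X + N has variance 2
            pxy = gauss(x, 1.0) * gauss(y - x, 1.0) * delta * delta
            if pxy > 0.0:
                mi += pxy * math.log2(pxy / (px * py))
    return mi

print(discretized_mi(0.1))   # close to the analytic value 0.5
```

Shrinking $\Delta$ keeps the result pinned near 0.5 bit, in contrast to the discretized entropies themselves, which diverge.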
8.5 Rate Distortion Theory
Rate distortion theory is concerned with the trade-offs between distortion and rate in lossy
compression schemes. Rate is defined as the average number of bits used to represent each
sample value. One way of representing the trade-offs is via a rate distortion function $R(D)$. The rate distortion function $R(D)$ specifies the lowest rate at which the output of a source can be encoded while keeping the distortion less than or equal to $D$. The distortion $D$ is generally taken to be the expected value of a single-letter distortion measure $d(x_i, y_j)$, which is a measure of the difference between $x_i$ and $y_j$:
\[
D = E[d(X,Y)] = \sum_{i=0}^{N-1}\sum_{j=0}^{M-1} d(x_i, y_j)\, P(x_i, y_j)
\]
On our way to mathematically defining the rate distortion function, let us look at the rate and
distortion for some different lossy compression schemes.
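As a first concrete data point, here is a toy sketch of rate and distortion for one such scheme. All specifics (an 8-level uniform source, a 1-bit quantizer, squared-error distortion) are choices made here for illustration, not taken from the text: the source emits $x \in \{0,\dots,7\}$ equiprobably (3 bits/sample uncompressed), and keeping only the most significant bit gives a rate of 1 bit/sample.

```python
# Toy rate-distortion data point.  All specifics (8-level uniform source,
# 1-bit quantizer, squared-error measure) are illustrative choices.
# Source: x in {0,...,7}, equiprobable -> 3 bits/sample uncompressed.
# Scheme: keep only the most significant bit -> rate R = 1 bit/sample.

def d(x, y):
    """Single-letter squared-error distortion measure d(x, y)."""
    return (x - y) ** 2

def Q(x):
    """1-bit quantizer: reconstruct with the midpoint of each half-range."""
    return 1.5 if x < 4 else 5.5

# The quantizer is deterministic, so P(x_i, y_j) = P(x_i) when y_j = Q(x_i)
# and 0 otherwise; the double sum for D collapses to a single sum.
P_x = 1.0 / 8.0
D = sum(d(x, Q(x)) * P_x for x in range(8))
print(D)   # 1.25
```

Spending more bits buys less distortion: a 2-bit quantizer with reconstruction points 0.5, 2.5, 4.5, 6.5 drives $D$ down to 0.25 by the same computation. That trade-off between $R$ and $D$ is exactly what $R(D)$ formalizes.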