Mathematical Preliminaries for Lossy Coding - Introduction to Data Compression


then find $V(x)$ and $V(y)$ and examine the difference between them. There are two problems with this approach. First, the process of human perception is inferential and very difficult to model. Second, even if we could find a mathematical model for perception, the odds are that it would be so complex that it would be mathematically intractable.

In spite of these disheartening prospects, the study of perception mechanisms is still important from the perspective of design and analysis of compression systems. Even if we cannot obtain a transformation that accurately models perception, we can learn something about the properties of perception that may come in handy in the design of compression systems. In the following, we will look at some of the properties of the human visual system and the perception of sound. Our review will be far from thorough, but the intent here is to present some properties that will be useful in later chapters when we talk about compression of images, video, speech, and audio.

8.3.1 The Human Visual System

The eye is a globe-shaped object with a lens in the front that focuses objects onto the retina in the back of the eye. The retina contains two kinds of receptors, called rods and cones. The rods are more sensitive to light than cones, and in low light most of our vision is due to the operation of rods. There are three kinds of cones, each of which is most sensitive at a different wavelength of the visible spectrum. The peak sensitivities of the cones are in the red, blue, and green regions of the visible spectrum [104]. The cones are mostly concentrated in a very small area of the retina called the fovea. Although the rods are more numerous than the cones, the cones provide better resolution because they are more closely packed in the fovea. The muscles of the eye move the eyeball, positioning the image of the object on the fovea. This becomes a drawback in low light, because the fovea is populated mostly by the less sensitive cones. One way to improve what you see in low light is to focus to one side of the object. This way the object is imaged on the rods, which are more sensitive to light.

The eye is sensitive to light over an enormously large range of intensities; the upper end of the range is about $10^{10}$ times the lower end of the range. However, at a given instant we cannot perceive the entire range of brightness. Instead, the eye adapts to an average brightness level. The range of brightness levels that the eye can perceive at any given instant is much smaller than the total range it is capable of perceiving.

If we illuminate a screen with a certain intensity $I$ and shine a spot on it with a different intensity, the spot becomes visible when the difference in intensity is $\Delta I$. This is called the just noticeable difference (jnd). The ratio $\Delta I / I$ is known as the Weber fraction or Weber ratio. This ratio is known to be constant at about 0.02 over a wide range of intensities in the absence of background illumination. However, if the background illumination is changed, the range over which the Weber ratio remains constant becomes relatively small. The constant range is centered around the intensity level to which the eye adapts.
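As a small illustration of the Weber fraction, the following Python sketch computes the just noticeable difference for a given background intensity. It assumes the ratio stays fixed at the approximate value of 0.02 quoted above, which the text says holds only over a limited range around the adaptation level. The function names and the constant name are hypothetical and chosen only for this example.

```python
# Approximate Weber fraction quoted in the text; treated here as a constant,
# which is an assumption that only holds near the eye's adaptation level.
WEBER_FRACTION = 0.02


def just_noticeable_difference(intensity, weber_fraction=WEBER_FRACTION):
    """Return the smallest intensity change visible against `intensity`."""
    if intensity <= 0:
        raise ValueError("intensity must be positive")
    return weber_fraction * intensity


def spot_is_visible(background, spot, weber_fraction=WEBER_FRACTION):
    """Return True when the spot differs from the background by at least one jnd."""
    jnd = just_noticeable_difference(background, weber_fraction)
    return abs(spot - background) >= jnd


# The jnd grows in proportion to the background intensity.
for background in (1.0, 100.0, 10_000.0):
    print(background, just_noticeable_difference(background))
```

Under this assumption, a spot must differ from a background of intensity 100 by about 2 units to be seen, while against a background of 10,000 it must differ by about 200 units.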

If $\Delta I / I$ is constant, then we can infer that the sensitivity of the eye to intensity is a logarithmic function ($d(\log I) = dI / I$). Thus, we can model the eye as a receptor whose output goes to a logarithmic nonlinearity. We also know that the eye acts as a spatial low-pass filter [105, 106]. Putting all of this information together, we can develop a model for monochromatic vision, shown in Figure 8.3.
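The two properties named above, a logarithmic nonlinearity and a spatial low-pass filter, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signal is a one-dimensional row of intensities, the low-pass filter is a simple moving average standing in for the eye's actual spatial response, and the ordering (logarithm first, then filter) is assumed, since Figure 8.3 is not reproduced here. All function names are hypothetical.

```python
import math


def log_nonlinearity(intensities):
    """Map each positive intensity to its natural logarithm."""
    return [math.log(i) for i in intensities]


def moving_average(signal, width=3):
    """Crude spatial low-pass filter (an assumed stand-in, not the eye's
    measured response): centered moving average with edge replication."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    n = len(signal)
    smoothed = []
    for k in range(n):
        # Clamp indices so samples beyond the ends repeat the edge value.
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(k - half, k + half + 1)]
        smoothed.append(sum(window) / width)
    return smoothed


def perceived_response(intensities, width=3):
    """Assumed model: logarithmic nonlinearity followed by low-pass filtering."""
    return moving_average(log_nonlinearity(intensities), width)


# A 2% step produces the same change in log output at any base intensity.
dim = log_nonlinearity([100.0, 102.0])
bright = log_nonlinearity([10_000.0, 10_200.0])
print(dim[1] - dim[0], bright[1] - bright[0])
```

The final lines show why a constant Weber fraction suggests a logarithmic response: equal intensity ratios map to equal output differences, regardless of the absolute intensity.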
