Digital Signal Processing Reference
In-Depth Information
In many cases files or data streams contain more information than is needed for a
particular purpose. For example, a picture may have more detail than the eye can
distinguish when reproduced at the largest size intended; an audio file does not need a lot
of fine detail during a very loud passage. Developing lossy compression techniques as
closely matched to human perception as possible is a complex task. In some cases the
ideal is a file which provides exactly the same perception as the original, with as much
digital information as possible removed; in other cases perceptible loss of quality is
considered a valid trade-off for the reduced data size.
Transform coding
More generally, lossy compression can be thought of as an application of transform
coding (in the case of multimedia data, perceptual coding): it transforms the raw data to
a domain that more accurately reflects the information content. For example, rather than
expressing a sound file as the amplitude levels over time, one may express it as the
frequency spectrum over time, which corresponds more accurately to human audio
perception.
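
As a rough illustration of the change of domain described above, the following sketch takes a short signal stored as amplitude over time and re-expresses it as a frequency spectrum. It uses a naive discrete Fourier transform written with the Python standard library only. The test tone, its length, and the names dft, tone and peak_bin are assumptions made for this example, and real perceptual codecs use different, windowed transforms.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a real-valued sequence."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Hypothetical test signal: a pure tone completing 5 cycles over 64 samples.
N = 64
tone = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]

spectrum = dft(tone)

# For real input the upper half of the spectrum mirrors the lower half,
# so only the first N // 2 bins are inspected.
magnitudes = [abs(c) for c in spectrum[: N // 2]]

# Bin with the largest magnitude, and all bins that are not numerically zero.
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
significant = [k for k, m in enumerate(magnitudes) if m > 1e-6]

print(peak_bin, significant)
```

In the time domain all 64 samples are needed to describe the tone, while in the frequency domain a single bin carries essentially all of the energy. That concentration is what makes the transformed representation a better starting point for deciding which information to keep.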
While data reduction (compression, be it lossy or lossless) is a main goal of transform
coding, it also serves other goals: one may represent data more accurately in the original
amount of space. For example, in principle, if one starts with an analog or
high-resolution digital master, an MP3 file of a given bitrate (e.g. 320 kbit/s) should
provide a better representation than raw uncompressed audio in a WAV or AIFF file of the
same bitrate. (Uncompressed audio can reach a lower bitrate only by lowering the sampling
frequency and/or the sampling resolution.) Further, transform coding may provide a better
domain for manipulating or otherwise editing the data. For example, equalization of audio
is most naturally expressed in the frequency domain (boost the bass, for instance) rather
than in the raw time domain.
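
The bitrate comparison above rests on simple arithmetic: the bitrate of uncompressed PCM audio is the sampling frequency times the sampling resolution times the number of channels. The sketch below works through it. The CD-style parameters (44.1 kHz, 16 bits, stereo) and the helper name pcm_bitrate are assumptions chosen for illustration and do not come from the text.

```python
def pcm_bitrate(sample_rate_hz, bits_per_sample, channels):
    """Bitrate in bit/s of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


# Assumed reference format: CD-style audio, 44.1 kHz, 16 bits, stereo.
cd_bitrate = pcm_bitrate(44_100, 16, 2)

# Target taken from the MP3 example in the text: 320 kbit/s.
target_bitrate = 320_000

# Highest sampling frequency that 16-bit stereo PCM can afford at that bitrate.
max_rate_16bit_stereo = target_bitrate // (16 * 2)

print(cd_bitrate)             # 1411200
print(max_rate_16bit_stereo)  # 10000
```

Under these assumptions, uncompressed 16-bit stereo audio squeezed into 320 kbit/s would have to drop to a 10 kHz sampling frequency, which can represent frequencies only up to 5 kHz. An MP3 at the same bitrate does not have to make that sacrifice.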
From this point of view, perceptual encoding is not essentially about discarding data, but
rather about a better representation of data.
Another use is for backward compatibility and graceful degradation: in color television,
encoding color via a luminance-chrominance transform domain (such as YUV) means
that black-and-white sets display the luminance, while ignoring the color information.
Another example is chroma subsampling: the use of color spaces such as YIQ, used in
NTSC, allows one to reduce the resolution of the components to accord with human
perception. Humans have the highest resolution for black-and-white (luma), lower resolution
for mid-spectrum colors like yellow and green, and the lowest for reds and blues. Thus NTSC
displays approximately 350 pixels of luma per scanline, 150 pixels of yellow vs. green,
and 50 pixels of blue vs. red, which are proportional to human sensitivity to each
component.
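
The two ideas above, a luminance-chrominance transform and reduced chroma resolution, can be sketched together. This is a minimal illustration and not how NTSC itself is implemented. It uses the common YUV weights (Y = 0.299 R + 0.587 G + 0.114 B) instead of YIQ. The 8-pixel scanline, the subsampling factors 2 and 4, and the function names are hypothetical, chosen only to keep the example small.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (components in 0..1) to luma and two chroma values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)  # blue-difference chroma
    v = 0.877 * (r - y)  # red-difference chroma
    return y, u, v


def subsample(channel, factor):
    """Average each run of `factor` neighbouring samples into one stored value."""
    groups = [channel[i:i + factor] for i in range(0, len(channel), factor)]
    return [sum(group) / len(group) for group in groups]


def upsample(channel, factor, length):
    """Repeat each stored value `factor` times to rebuild a full-width line."""
    return [channel[i // factor] for i in range(length)]


# Hypothetical scanline: four red pixels followed by four blue pixels.
scanline = [(1.0, 0.0, 0.0)] * 4 + [(0.0, 0.0, 1.0)] * 4

yuv = [rgb_to_yuv(*pixel) for pixel in scanline]
luma = [p[0] for p in yuv]
u = [p[1] for p in yuv]
v = [p[2] for p in yuv]

# A black-and-white set would display `luma` and ignore `u` and `v`.

# Keep luma at full resolution; store chroma at reduced resolution
# (factors 2 and 4 are arbitrary illustrative choices).
u_small = subsample(u, 2)
v_small = subsample(v, 4)

stored_full = len(luma) + len(u) + len(v)
stored_sub = len(luma) + len(u_small) + len(v_small)

# The decoder rebuilds full-width chroma lines before converting back to RGB.
u_rebuilt = upsample(u_small, 2, len(luma))
v_rebuilt = upsample(v_small, 4, len(luma))

print(stored_full, stored_sub)  # 24 14
```

Because luma is untouched, fine brightness detail survives intact, while the sample count for the line drops from 24 to 14. The saving comes entirely from the components to which the eye is less sensitive.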
Information loss