V_j can then be defined as

    V_j = {x : d(x, y_j) < d(x, y_i) for all i ≠ j}        (4)
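As an illustration of Eq. (4), the following minimal sketch finds the region V_j that contains an input vector x, assuming squared Euclidean distance as the distortion measure d and a small hypothetical two-dimensional codebook:

```python
import numpy as np

def quantize(x, codebook):
    """Return the index j of the output point y_j closest to x,
    i.e. the index of the region V_j from Eq. (4) containing x.
    Squared Euclidean distance is assumed as the distortion measure d."""
    distortions = np.sum((codebook - x) ** 2, axis=1)
    return int(np.argmin(distortions))

# Hypothetical codebook of four two-dimensional output points.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(quantize(np.array([0.9, 0.2]), codebook))  # the closest output point is y_1 = (1, 0)
```

Ties on the boundary (where the two distortions are equal) must be broken by convention; `argmin` here simply picks the lowest index.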
Thus, the quantizer is completely defined by the output points and a distortion measure.
From a multidimensional point of view, using a scalar quantizer for each input restricts
the output points to a rectangular grid. Observing several source output values at once allows
us to move the output points around. Another way of looking at this is that in one dimension
the quantization regions are restricted to be intervals, and the only parameter that we can
manipulate is the size of these intervals. When we divide the input into vectors of some length
n, the quantization regions are no longer restricted to be rectangles or squares. We have the
freedom to divide the range of the inputs in an infinite number of ways.
These examples have shown two ways in which the vector quantizer can be used to improve
performance. In the first case, we exploited the sample-to-sample dependence of the input. In
the second case, there was no sample-to-sample dependence; the samples were independent.
However, looking at two samples together still improved performance.
These two examples can be used to motivate two somewhat different approaches toward
vector quantization. One approach is a pattern-matching approach, similar to the process used
in Example 10.3.1, while the other approach deals with the quantization of random inputs. We
will look at both of these approaches in this chapter.
10.4 The Linde-Buzo-Gray Algorithm
In Example 10.3.1 we saw that one way of exploiting the structure in the source output is to
place the quantizer output points where the source output points (blocked into vectors) are most
likely to congregate. The set of quantizer output points is called the codebook of the quantizer,
and the process of placing these output points is often referred to as codebook design. When
we group the source output in two-dimensional vectors, as in the case of Example 10.3.1,
we might be able to obtain a good codebook design by plotting a representative set of source
output points and then visually locating where the quantizer output points should be. However,
this approach to codebook design breaks down when we design higher-dimensional vector
quantizers. Consider designing the codebook for a 16-dimensional quantizer. Obviously, a
visual placement approach will not work in this case. We need an automatic procedure for
locating where the source outputs are clustered.
This is a familiar problem in the field of pattern recognition. It is no surprise, therefore,
that the most popular approach to designing vector quantizers is a clustering procedure known
as the k-means algorithm, which was developed for pattern recognition applications.
The k-means algorithm functions as follows: Given a large set of output vectors from the
source, known as the training set, and an initial set of k representative patterns, assign each
element of the training set to the closest representative pattern. After an element is assigned,
the representative pattern is updated by computing the centroid of the training set vectors
assigned to it. When the assignment process is complete, we will have k groups of vectors
clustered around each of the output points.
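The assign-then-update loop described above can be sketched as follows. This is a minimal batch variant (all training vectors are assigned first, then the centroids are recomputed); the iteration count and the rule for empty cells are choices of this sketch, not fixed by the text:

```python
import numpy as np

def k_means(training_set, initial_codebook, n_iters=20):
    """Sketch of the k-means procedure: assign each training vector to
    the closest representative pattern, then replace each pattern with
    the centroid of the vectors assigned to it, and repeat."""
    codebook = initial_codebook.copy()
    for _ in range(n_iters):
        # Assignment step: index of the closest representative for each vector.
        dists = np.sum((training_set[:, None, :] - codebook[None, :, :]) ** 2, axis=2)
        labels = np.argmin(dists, axis=1)
        # Centroid step: update each representative pattern.
        for j in range(len(codebook)):
            members = training_set[labels == j]
            if len(members) > 0:  # keep the old pattern if no vectors were assigned
                codebook[j] = members.mean(axis=0)
    return codebook, labels

# Hypothetical training set: two well-separated clusters of 2-D vectors.
data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                 [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
codebook, labels = k_means(data, data[:2].copy())
```

With this data, the representatives converge to the centroids of the two clusters after a couple of iterations, even though both initial patterns were taken from the same cluster.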