is the same as we used in Sections 12.2 and 12.3 to compare a vector x with a hyperplane perpendicular to a vector w. In fact, if the lines that actually form parts of the Voronoi diagram are preprocessed properly, we can make the determination in O(log n) comparisons; it is not necessary to compare q with all of the O(n log n) lines that form part of the diagram.
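To make the side-of-a-line test concrete, here is a minimal sketch (the function name and the line representation are illustrative, not from the text): a line can be written as the set of points x with w . x = b for some perpendicular vector w, and a single dot product tells us on which side of that line a query point q falls.

```python
import numpy as np

def side_of_line(q, w, b):
    """Return +1, -1, or 0 according to the sign of w.q - b, i.e. the side
    of the line {x : w.x = b} on which the query point q lies."""
    s = float(np.dot(w, q)) - b
    return 0 if s == 0 else (1 if s > 0 else -1)

# Example: the vertical line x = 3 is {x : (1, 0).x = 3}.
print(side_of_line(np.array([5.0, 2.0]), np.array([1.0, 0.0]), 3.0))  # +1: q lies to the right
print(side_of_line(np.array([1.0, 2.0]), np.array([1.0, 0.0]), 3.0))  # -1: q lies to the left
```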
12.4.3 Learning One-Dimensional Functions
Another simple and useful case of nearest-neighbor learning has one-dimensional data. In this situation, the training examples are of the form ([x], y), and we shall write them as (x, y), identifying a one-dimensional vector with its lone component. In effect, the training set is a collection of samples of the value of a function y = f(x) for certain values of x, and we must interpolate the function f at all points. There are many rules that could be used, and we shall only outline some of the popular approaches. As discussed in Section 12.4.1, the approaches vary in the number of neighbors they use, whether or not the neighbors are weighted, and if so, how the weight varies with distance.
Suppose we use a method with k nearest neighbors, and x is the query point. Let x_1, x_2, . . . , x_k be the k nearest neighbors of x, and let the weight associated with training point (x_i, y_i) be w_i. Then the estimate of the label y for x is

    y = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i}

Note that this expression gives the weighted average of the labels of the k nearest neighbors.
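As an illustration, here is a minimal sketch (the helper name and signature are assumptions, not the book's code) of this weighted-average estimate for one-dimensional data; the weight function is a parameter, so the unweighted rules discussed below correspond to a constant weight.

```python
def knn_estimate(train, q, k, weight):
    """Estimate f(q) from a training set of (x, y) pairs as the weighted
    average of the labels of the k training points nearest to q.
    weight(d) gives the weight of a neighbor at distance d from q."""
    neighbors = sorted(train, key=lambda xy: abs(xy[0] - q))[:k]
    ws = [weight(abs(x - q)) for x, _ in neighbors]
    return sum(w * y for w, (_, y) in zip(ws, neighbors)) / sum(ws)

# Using the training set of Example 12.12 below, the average-of-two-nearest
# rule (constant weight) at the query point q = 3.5 gives (4 + 8) / 2 = 6.0.
train = [(1, 1), (2, 2), (3, 4), (4, 8), (5, 4), (6, 2), (7, 1)]
print(knn_estimate(train, 3.5, k=2, weight=lambda d: 1.0))  # 6.0
```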
EXAMPLE 12.12 We shall illustrate four simple rules, using the training set (1, 1), (2, 2), (3, 4), (4, 8), (5, 4), (6, 2), and (7, 1). These points represent a function that has a peak at x = 4 and decays exponentially on both sides. Note that this training set has values of x that are evenly spaced. There is no requirement that the points have any regular pattern. Some possible ways to interpolate values are:
(1) Nearest Neighbor. Use only the one nearest neighbor. There is no need for a weighting. Just take the value of any f(x) to be the label y associated with the training-set point nearest to query point x. The result of using this rule on the example training set described above is shown in Fig. 12.22(a).
(2) Average of the Two Nearest Neighbors. Choose 2 as the number of nearest neighbors to use. The weights of these two are each 1/2, regardless of how far they are from the query point x. The result of this rule on the example training set is in Fig. 12.22(b).
(3) Weighted Average of the Two Nearest Neighbors. We again choose two nearest neighbors, but we weight them in inverse proportion to their distance from the query point. Suppose the two neighbors nearest to query point x are x_1 and x_2. Suppose first that x_1