< x < x2. Then the weight of x1, the inverse of its distance from x, is 1/(x − x1), and the weight of x2 is 1/(x2 − x). The weighted average of the labels is

    (y1/(x − x1) + y2/(x2 − x)) / (1/(x − x1) + 1/(x2 − x))

which, when we multiply numerator and denominator by (x − x1)(x2 − x), simplifies to

    (y1(x2 − x) + y2(x − x1)) / (x2 − x1)
This expression is the linear interpolation of the two nearest neighbors, as shown in Fig. 12.23(a). When both nearest neighbors are on the same side of the query x, the same weights make sense, and the resulting estimate is an extrapolation. We see extrapolation in Fig. 12.23(a) in the range x = 0 to x = 1. In general, when points are unevenly spaced, we can find query points in the interior where both neighbors are on one side.
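This weighting rule can be sketched in Python. The following is an illustrative sketch, not code from the book: it assumes one-dimensional training data given as (x, y) pairs with distinct x values, and the function name is made up. It uses the simplified form of the weighted average, which gives linear interpolation when the query lies between its two nearest neighbors and linear extrapolation when both lie on the same side.

```python
def two_nn_estimate(train, x):
    """Inverse-distance-weighted average of the two nearest neighbors.

    Uses the simplified form (y1*(x2 - x) + y2*(x - x1)) / (x2 - x1),
    i.e., the line through the two nearest training points evaluated
    at the query x.
    """
    # Pick the two training points closest to the query x.
    (x1, y1), (x2, y2) = sorted(train, key=lambda p: abs(p[0] - x))[:2]
    # Valid whether x lies between x1 and x2 (interpolation)
    # or outside them (extrapolation).
    return (y1 * (x2 - x) + y2 * (x - x1)) / (x2 - x1)
```

For example, with training points (1, 1), (2, 2), and (3, 4), the query x = 1.5 interpolates between (1, 1) and (2, 2) to give 1.5, while x = 0 extrapolates the same line to give 0.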
(4) Average of Three Nearest Neighbors. We can average any number of the nearest neighbors to estimate the label of a query point. Figure 12.23(b) shows what happens on our example training set when the three nearest neighbors are used. □
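Rule (4) can be sketched similarly. Again this is a hypothetical helper, not from the book, assuming one-dimensional (x, y) training pairs; it generalizes to any number k of neighbors:

```python
def knn_average(train, x, k=3):
    """Unweighted average of the labels of the k nearest neighbors of x."""
    # Sort training points by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)
```

Note that, unlike the two-neighbor weighted rule, a plain average changes only when the set of k nearest neighbors changes, so the resulting estimate is a step function of the query x.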
Figure 12.22: Results of applying the first two rules in Example 12.12

Figure 12.23: Results of applying the last two rules in Example 12.12
12.4.4 Kernel Regression
A way to construct a continuous function that represents the data of a training set well is
to consider all points in the training set, but weight the points using a kernel function that