decays with distance. A popular choice is to use a normal distribution (or "bell curve"), so
the weight of a training point x when the query is q is e^(−(x−q)²/σ²). Here σ is the standard
deviation of the distribution and the query q is the mean. Roughly, points within distance σ of
q are heavily weighted, and those farther away have little weight. The advantage of using
a kernel function that is itself continuous and defined for all points in the training
set is that the resulting function learned from the data is guaranteed to be continuous (see
Exercise 12.4.6 for a discussion of the problem when a simpler weighting is used).
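The Gaussian weighting just described can be sketched as a short function. This is a minimal illustration, not from the text; the function name and the sample values are our own:

```python
import math

def gaussian_weight(x, q, sigma):
    """Weight of training point x for query q under a Gaussian kernel,
    e^(-(x-q)^2 / sigma^2), with the query q acting as the mean."""
    return math.exp(-((x - q) ** 2) / sigma ** 2)

# A point within sigma of the query keeps substantial weight,
# while one several sigmas away is weighted almost zero.
print(gaussian_weight(1.0, 1.5, 1.0))  # distance 0.5: e^(-0.25) ≈ 0.78
print(gaussian_weight(1.0, 4.0, 1.0))  # distance 3:   e^(-9)    ≈ 0.0001
```

Note the weight is 1 exactly when x = q and never reaches 0, so every training point contributes at least a little to every query.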
EXAMPLE 12.13 Let us use the seven training examples of Example 12.12. To simplify
calculation, we shall not use the normal distribution as the kernel function, but rather
another continuous function of distance, namely w = 1/(x − q)². That is, weights decay as
the square of the distance. Suppose the query q is 3.5. The weights w₁, w₂, …, w₇ of the
seven training examples (xᵢ, yᵢ) = (i, 8/2^|i−4|) for i = 1, 2, …, 7 are shown in Fig. 12.24.
(1)  xᵢ        1      2      3      4      5      6      7
(2)  yᵢ        1      2      4      8      4      2      1
(3)  wᵢ       4/25   4/9     4      4     4/9    4/25   4/49
(4)  wᵢyᵢ     4/25   8/9    16     32    16/9    8/25   4/49

Figure 12.24 Weights of points when the query is q = 3.5
Lines (1) and (2) of Fig. 12.24 give the seven training points. The weight of each when
the query is q = 3.5 is given in line (3). For instance, for x₁ = 1, the weight is
w₁ = 1/(1 − 3.5)² = 1/(−2.5)² = 4/25. Then, line (4) shows each yᵢ weighted by the weight
from line (3). For instance, the column for x₂ has value 8/9 because w₂y₂ = (4/9) × 2.
Problems in the Limit for Example 12.13
Suppose q is exactly equal to one of the training examples x. If we use the normal distribution as the kernel function,
there is no problem with the weight of x: it is 1. However, with the kernel function discussed in Example 12.13, the
weight of x is 1/(x − q)² = ∞. Fortunately, this weight appears in both the numerator and denominator of the expression
that estimates the label of q. It can be shown that in the limit as q approaches x, the label of x dominates all the
other terms in both numerator and denominator, so the estimated label of q is the same as the label of x. That makes
excellent sense, since q = x in the limit.
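In practice, an implementation of the inverse-square kernel must handle this limit case explicitly, since dividing by zero is not an option. One way, sketched below under our own naming (the function is illustrative, not from the text), is to return the training label directly when the query coincides with a training point:

```python
def inverse_square_estimate(query, points):
    """Kernel-regression estimate using weights w = 1/(x - q)^2.
    If the query equals a training point x, return that point's label,
    which is the value the estimate approaches in the limit q -> x."""
    for x, y in points:
        if x == query:          # limit case: label of x dominates
            return y
    num = sum(y / (x - query) ** 2 for x, y in points)
    den = sum(1 / (x - query) ** 2 for x, y in points)
    return num / den

# The seven training points of Example 12.12: (i, 8/2^|i-4|) for i = 1..7.
training = [(i, 8 / 2 ** abs(i - 4)) for i in range(1, 8)]
print(inverse_square_estimate(3, training))  # exact match: label 4.0
```

This special case is exactly the limit argument above: as q approaches x, the term for x swamps every other term in both sums.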
To compute the label for the query q = 3.5 we sum the weighted values of the labels in
the training set, as given by line (4) of Fig. 12.24; this sum is 51.23. We then divide by the
sum of the weights in line (3). This sum is 9.29, so the ratio is 51.23/9.29 = 5.51. That
estimate of the value of the label for q = 3.5 seems intuitively reasonable, since q lies
midway between two points with labels 4 and 8.
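The whole calculation of Example 12.13 can be reproduced in a few lines; the variable names are ours, but the data and arithmetic follow the example:

```python
# Training points (i, 8/2^|i-4|) for i = 1..7, as in Example 12.12.
training = [(i, 8 / 2 ** abs(i - 4)) for i in range(1, 8)]
q = 3.5

# Line (3) of Fig. 12.24: weights w_i = 1/(x_i - q)^2.
weights = [1 / (x - q) ** 2 for x, _ in training]

# Line (4): each label y_i scaled by its weight.
weighted = [w * y for w, (_, y) in zip(weights, training)]

# Estimate = (sum of weighted labels) / (sum of weights).
estimate = sum(weighted) / sum(weights)
print(round(sum(weighted), 2), round(sum(weights), 2), round(estimate, 2))
# → 51.23 9.29 5.51
```

The printed values match the sums 51.23 and 9.29 and the estimate 5.51 computed in the text.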