is the same as we used in Sections 12.2 and 12.3 to compare a vector x with a hyperplane perpendicular to a vector w. In fact, if the lines that actually form parts of the Voronoi diagram are preprocessed properly, we can make the determination in O(log n) comparisons; it is not necessary to compare q with all of the O(n log n) lines that form part of the diagram.
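To make the side-of-a-line test concrete, here is a minimal sketch (the function name and the line representation are illustrative, not from the text): a line can be written as the set of points x with w . x = b for some perpendicular vector w, and a single dot product tells us on which side of that line a query point q falls.

```python
import numpy as np

def side_of_line(q, w, b):
    """Return +1, -1, or 0 according to the sign of w.q - b, i.e. the side
    of the line {x : w.x = b} on which the query point q lies."""
    s = float(np.dot(w, q)) - b
    return 0 if s == 0 else (1 if s > 0 else -1)

# Example: the vertical line x = 3 is {x : (1, 0).x = 3}.
print(side_of_line(np.array([5.0, 2.0]), np.array([1.0, 0.0]), 3.0))  # +1: q lies to the right
print(side_of_line(np.array([1.0, 2.0]), np.array([1.0, 0.0]), 3.0))  # -1: q lies to the left
```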
12.4.3 Learning One-Dimensional Functions
Another simple and useful case of nearest-neighbor learning has one-dimensional data. In this situation, the training examples are of the form ([x], y), and we shall write them as (x, y), identifying a one-dimensional vector with its lone component. In effect, the training set is a collection of samples of the value of a function y = f(x) for certain values of x, and we must interpolate the function f at all points. There are many rules that could be used, and we shall only outline some of the popular approaches. As discussed in Section 12.4.1, the approaches vary in the number of neighbors they use, whether or not the neighbors are weighted, and if so, how the weight varies with distance.
Suppose we use a method with k nearest neighbors, and x is the query point. Let x_1, x_2, . . . , x_k be the k nearest neighbors of x, and let the weight associated with training point (x_i, y_i) be w_i. Then the estimate of the label y for x is

    y = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i}

Note that this expression gives the weighted average of the labels of the k nearest neighbors.
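As an illustration, here is a minimal sketch (the helper name and signature are assumptions, not the book's code) of this weighted-average estimate for one-dimensional data; the weight function is a parameter, so the unweighted rules discussed below correspond to a constant weight.

```python
def knn_estimate(train, q, k, weight):
    """Estimate f(q) from a training set of (x, y) pairs as the weighted
    average of the labels of the k training points nearest to q.
    weight(d) gives the weight of a neighbor at distance d from q."""
    neighbors = sorted(train, key=lambda xy: abs(xy[0] - q))[:k]
    ws = [weight(abs(x - q)) for x, _ in neighbors]
    return sum(w * y for w, (_, y) in zip(ws, neighbors)) / sum(ws)

# Using the training set of Example 12.12 below, the average-of-two-nearest
# rule (constant weight) at the query point q = 3.5 gives (4 + 8) / 2 = 6.0.
train = [(1, 1), (2, 2), (3, 4), (4, 8), (5, 4), (6, 2), (7, 1)]
print(knn_estimate(train, 3.5, k=2, weight=lambda d: 1.0))  # 6.0
```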
EXAMPLE 12.12 We shall illustrate four simple rules, using the training set (1, 1), (2, 2), (3, 4), (4, 8), (5, 4), (6, 2), and (7, 1). These points represent a function that has a peak at x = 4 and decays exponentially on both sides. Note that this training set has values of x that are evenly spaced. There is no requirement that the points have any regular pattern. Some possible ways to interpolate values are:
(1) Nearest Neighbor. Use only the one nearest neighbor. There is no need for a weighting. Just take the value of any f(x) to be the label y associated with the training-set point nearest to query point x. The result of using this rule on the example training set described above is shown in Fig. 12.22(a).
(2) Average of the Two Nearest Neighbors. Choose 2 as the number of nearest neighbors to use. The weights of these two are each 1/2, regardless of how far they are from the query point x. The result of this rule on the example training set is in Fig. 12.22(b).
(3) Weighted Average of the Two Nearest Neighbors. We again choose two nearest neighbors, but we weight them in inverse proportion to their distance from the query point. Suppose the two neighbors nearest to query point x are x_1 and x_2. Suppose first that x_1