then the second term of Equation 12.4 will always be 0, which simplifies your work
considerably.
! EXERCISE 12.3.3 The following training set obeys the rule that the positive examples all
have vectors whose components have an odd sum, while the sum is even for the negative
examples.
([1, 2], +1) ([3, 4], +1) ([5, 2], +1)
([2, 4], −1) ([3, 1], −1) ([7, 3], −1)
(a) Suggest a starting vector w and constant b that classify at least three of the points
correctly.
!! (b) Starting with your answer to (a), use gradient descent to find the optimal w and b.
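Equation 12.4 itself is not reproduced in this excerpt; the sketch below assumes it is the regularized hinge-loss objective f(w, b) = ||w||^2/2 + C * sum_i max(0, 1 - y_i(w.x_i + b)), consistent with the remark above that its second term is 0 when every point is classified correctly with a margin of at least 1. The step size eta, the penalty C, the iteration count, and the starting point are illustrative choices, not part of the exercise.

    import numpy as np

    # Training set from Exercise 12.3.3.
    X = np.array([[1, 2], [3, 4], [5, 2], [2, 4], [3, 1], [7, 3]], dtype=float)
    y = np.array([+1, +1, +1, -1, -1, -1], dtype=float)

    w = np.zeros(2)      # replace with the starting vector from part (a)
    b = 0.0
    eta, C = 0.01, 1.0   # illustrative step size and slack penalty

    for _ in range(1000):
        margins = y * (X @ w + b)
        viol = margins < 1                 # points violating the margin
        # Subgradient of the hinge term is -y_i * x_i at each violating point.
        grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= eta * grad_w
        b -= eta * grad_b

    print(w, b)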
12.4 Learning from Nearest Neighbors
In this section we consider several examples of “learning,” where the entire training set is
stored, perhaps preprocessed in some useful way, and then used to classify future examples
or to compute the value of the label that is most likely associated with the example. The
feature vector of each training example is treated as a data point in some space. When a
new point arrives and must be classified, we find the training example or examples that are
closest to the new point, according to the distance measure for that space. The estimated
label is then computed by combining the closest examples in some way.
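As a concrete illustration of this framework, here is a minimal sketch of the simplest case: Euclidean distance and a single nearest neighbor, whose label is returned directly. The function names and the small training set are hypothetical, chosen only for the example.

    import math

    def euclidean(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    def nearest_label(training_set, query):
        # Return the label of the stored example closest to the query point.
        point, label = min(training_set, key=lambda ex: euclidean(ex[0], query))
        return label

    training_set = [([1, 2], +1), ([3, 4], +1), ([2, 4], -1)]
    print(nearest_label(training_set, [2, 3]))   # [2, 4] is closest, so -1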
12.4.1 The Framework for Nearest-Neighbor Calculations
The training set is first preprocessed and stored. The decisions take place when a new
example, called the query example, arrives and must be classified.
There are several decisions we must make in order to design a nearest-neighbor-based
algorithm that will classify query examples. We enumerate them here:
(1) What distance measure do we use?
(2) How many of the nearest neighbors do we look at?
(3) How do we weight the nearest neighbors? Normally, we provide a function (the kernel
function) of the distance between the query example and its nearest neighbors in the
training set, and use this function to weight the neighbors, as in the sketch below.
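The following sketch ties the three decisions together, assuming Euclidean distance for (1), k = 3 for (2), and a Gaussian kernel of the distance for (3). The value of k, the kernel width sigma, and the function names are illustrative assumptions, not choices prescribed by the text.

    import math

    def knn_classify(training_set, query, k=3, sigma=1.0):
        # Decision (1): Euclidean distance between feature vectors.
        def dist(p, q):
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

        # Decision (2): keep only the k training examples nearest the query.
        neighbors = sorted(training_set, key=lambda ex: dist(ex[0], query))[:k]

        # Decision (3): weight each neighbor's label by a kernel function
        # of its distance to the query, here a Gaussian.
        score = 0.0
        for x, label in neighbors:
            weight = math.exp(-dist(x, query) ** 2 / (2 * sigma ** 2))
            score += weight * label
        return +1 if score >= 0 else -1

A weighted vote of this kind lets nearby neighbors dominate distant ones; with a constant kernel it degenerates into an ordinary majority vote among the k nearest neighbors.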