Database Reference
In-Depth Information
Figure 12.1 Repeat of Fig. 7.1 , indicating the heights and weights of certain dogs
An appropriate way to implement the decision function f would be to imagine two lines,
shown dashed in Fig. 12.1 . The horizontal line represents a height of 7 inches and separ-
ates Beagles from Chihuahuas and Dachshunds. The vertical line represents a weight of 3
pounds and separates Chihuahuas from Beagles and Dachshunds. The algorithm that im-
plements f is:
if (height > 7) print Beagle
else if (weight < 3) print Chihuahua
else print Dachshund;
Recall that the original intent of Fig. 7.1 was to cluster points without knowing which
variety of dog they represented. That is, the label associated with a given height-weight
vector was not available. Here, we are performing supervised learning with the same data
augmented by classifications for the training data.
EXAMPLE 12.2 As an example of supervised learning, the four points (0, 2), (2, 1), (3, 4),
and (4, 3) from Fig.11.1 (repeated here as Fig. 12.2 ), can be thought of as a training set,
where the vectors are one-dimensional. That is, the point (1, 2) can be thought of as a pair
([1], 2), where [ 1 ] is the one-dimensional feature vector x , and 2 is the associated label y ;
the other points can be interpreted similarly.
Figure 12.2 Repeat of Fig. 11.1 , to be used as a training set
Suppose we want to “learn” the linear function f ( x ) = ax + b that best rep resents the
points of the training set. A natural interpretation of “best” is that the RMSE of the value
of f ( x ) compared with the given value of y is minimized. That is, we want to minimize
Search WWH ::




Custom Search