12.1.3 Approaches to Machine Learning
There are many forms of ML algorithms, and we shall not cover them all here. Here are the
major classes of such algorithms, each of which is distinguished by the form in which the
function f is represented.
(1) Decision trees were discussed briefly in Section 9.2.7. The form of f is a tree, and
each node of the tree has a function of x that determines to which child or children the
search must proceed. While we saw only binary trees in Section 9.2.7, in general a
decision tree can have any number of children for each node. Decision trees are suitable
for binary and multiclass classification, especially when the dimension of the feature
vector is not too large (large numbers of features can lead to overfitting).
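As a sketch of the structure just described, the following hand-built tree has a function at each interior node that selects a child (here a three-way root, showing that nodes need not be binary). The features, thresholds, and class labels are invented for illustration, not taken from the text.

```python
# A minimal, hand-built decision tree: each interior node holds a function
# of the feature vector x that selects which child to examine next;
# leaves hold class labels. Values here are illustrative only.

class Leaf:
    def __init__(self, label):
        self.label = label

class Node:
    def __init__(self, test, children):
        self.test = test          # function mapping x to a child index
        self.children = children  # list of subtrees (any number, not just 2)

def classify(tree, x):
    """Follow tests from the root down until a leaf's label is reached."""
    while isinstance(tree, Node):
        tree = tree.children[tree.test(x)]
    return tree.label

# A three-way root: route on x[0] being low, medium, or high.
tree = Node(
    test=lambda x: 0 if x[0] < 1.0 else (1 if x[0] < 5.0 else 2),
    children=[
        Leaf("A"),
        Node(test=lambda x: 0 if x[1] < 2.0 else 1,
             children=[Leaf("B"), Leaf("C")]),
        Leaf("C"),
    ],
)

print(classify(tree, [0.5, 0.0]))  # "A": first branch of the root
print(classify(tree, [3.0, 9.9]))  # "C": middle branch, then x[1] >= 2.0
```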
(2) Perceptrons are threshold functions applied to the components of the vector x = [x1,
x2, . . . , xn]. A weight wi is associated with the ith component, for each i = 1, 2, . . . , n,
and there is a threshold θ. The output is +1 if

w1x1 + w2x2 + · · · + wnxn > θ

and the output is −1 otherwise. A perceptron is suitable for binary classification, even
when the number of features is very large, e.g., the presence or absence of words in a
document. Perceptrons are the topic of Section 12.2.
(3) Neural nets are acyclic networks of perceptrons, with the outputs of some perceptrons
used as inputs to others. These are suitable for binary or multiclass classification, since
there can be several perceptrons used as output, with one or more indicating each class.
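To illustrate why wiring perceptrons together adds power, here is a tiny acyclic network whose two hidden perceptrons feed a third. With this hand-picked (not learned) choice of weights it computes XOR on ±1 inputs, a function no single perceptron can represent.

```python
# A tiny acyclic network of perceptrons: the +1/-1 outputs of two "hidden"
# units feed a third unit. These hand-picked weights compute XOR, which is
# beyond the reach of any single perceptron.

def unit(w, theta, x):
    """One perceptron: +1 if the weighted sum exceeds theta, else -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > theta else -1

def xor_net(x1, x2):
    h1 = unit([1, 1], -1.5, [x1, x2])    # OR:  -1 only when both inputs are -1
    h2 = unit([1, 1], 1.5, [x1, x2])     # AND: +1 only when both inputs are +1
    return unit([1, -1], 0.5, [h1, h2])  # OR and not AND

for a in (-1, 1):
    for b in (-1, 1):
        print(a, b, xor_net(a, b))  # +1 exactly when the inputs differ
```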
(4) Instance-based learning uses the entire training set to represent the function f. The
calculation of the label y associated with a new feature vector x can involve examination
of the entire training set, although usually some preprocessing of the training set
enables the computation of f(x) to proceed efficiently. We shall consider an important
kind of instance-based learning, k-nearest-neighbor, in Section 12.4. For example,
1-nearest-neighbor classifies data by giving it the same class as that of its nearest
training example. There are k-nearest-neighbor algorithms that are appropriate for any
kind of classification, although we shall concentrate on the case where y and the
components of x are real numbers.
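The 1-nearest-neighbor case mentioned above can be sketched directly: the "model" is simply the training set, and a query point takes the label of its closest training example, here under Euclidean distance. The training points and labels are made up for illustration.

```python
# 1-nearest-neighbor: the entire training set represents f, and computing
# f(x) examines every training example (no preprocessing in this sketch).
# The training data below is invented for illustration.

import math

def nearest_neighbor(training, x):
    """training: list of (feature_vector, label) pairs."""
    def dist(p, q):
        return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))
    _, label = min(training, key=lambda pair: dist(pair[0], x))
    return label

training = [([0.0, 0.0], "red"), ([1.0, 1.0], "red"),
            ([5.0, 5.0], "blue"), ([6.0, 5.0], "blue")]

print(nearest_neighbor(training, [0.4, 0.2]))  # "red": closest to (0, 0)
print(nearest_neighbor(training, [5.5, 4.8]))  # "blue"
```

In practice the linear scan over the training set is replaced by an index structure (e.g. a k-d tree) so that f(x) can be computed efficiently, as the text notes.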
(5) Support-vector machines are an advance over the algorithms traditionally used to
select the weights and threshold. The result is a classifier that tends to be more accurate
on unseen data. We discuss support-vector machines in Section 12.3.