9.2.2 Defining a Network Topology
“How can I design the neural network's topology?” Before training can begin, the user
must decide on the network topology by specifying the number of units in the input
layer, the number of hidden layers (if more than one), the number of units in each
hidden layer, and the number of units in the output layer.
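The topology choices above can be captured as a simple list of layer sizes, from input to output. A minimal sketch (the sizes here are illustrative, not from any particular dataset):

```python
# Illustrative topology: 4 input units, two hidden layers of 3 and 2 units,
# and 1 output unit. A feed-forward network then needs one weight matrix
# per adjacent pair of layers (bias units omitted for brevity).
layer_sizes = [4, 3, 2, 1]

weight_shapes = [(layer_sizes[i + 1], layer_sizes[i])
                 for i in range(len(layer_sizes) - 1)]
print(weight_shapes)  # [(3, 4), (2, 3), (1, 2)]
```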
Normalizing the input values for each attribute measured in the training tuples will
help speed up the learning phase. Typically, input values are normalized so as to fall
between 0.0 and 1.0. Discrete-valued attributes may be encoded such that there is one
input unit per domain value. For example, if an attribute A has three possible or known
values, namely {a0, a1, a2}, then we may assign three input units to represent A. That
is, we may have, say, I0, I1, I2 as input units. Each unit is initialized to 0. If A = a0, then
I0 is set to 1 and the rest are 0. If A = a1, then I1 is set to 1 and the rest are 0, and
so on.
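Both preprocessing steps can be sketched in a few lines. This is a minimal illustration of min-max normalization into [0.0, 1.0] and the one-input-unit-per-domain-value encoding; the domain values match the example above:

```python
def min_max_normalize(values):
    """Scale numeric input values into the range [0.0, 1.0]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(value, domain):
    """One input unit per domain value: 1 for the observed value, 0 otherwise."""
    return [1 if value == d else 0 for d in domain]

domain_A = ["a0", "a1", "a2"]
print(min_max_normalize([10, 20, 30]))  # [0.0, 0.5, 1.0]
print(one_hot("a1", domain_A))          # [0, 1, 0]
```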
Neural networks can be used for both classification (to predict the class label of a
given tuple) and numeric prediction (to predict a continuous-valued output). For clas-
sification, one output unit may be used to represent two classes (where the value 1
represents one class, and the value 0 represents the other). If there are more than two
classes, then one output unit per class is used. (See Section 9.7.1 for more strategies on
multiclass classification.)
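The output-layer sizing rule just described can be sketched directly; the function names here are illustrative, not from any library:

```python
def num_output_units(class_labels):
    """One 0/1 output unit suffices for two classes; with more than
    two classes, use one output unit per class."""
    k = len(set(class_labels))
    return 1 if k <= 2 else k

def encode_target(label, class_labels):
    """Target vector for a training tuple, matching the scheme above."""
    classes = sorted(set(class_labels))
    if len(classes) <= 2:
        return [classes.index(label)]                      # 0 or 1
    return [1 if c == label else 0 for c in classes]       # one-hot

print(encode_target("yes", ["no", "yes"]))   # [1]
print(encode_target("b", ["a", "b", "c"]))   # [0, 1, 0]
```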
There are no clear rules as to the “best” number of hidden layer units. Network design
is a trial-and-error process and may affect the accuracy of the resulting trained net-
work. The initial values of the weights may also affect the resulting accuracy. Once a
network has been trained and its accuracy is not considered acceptable, it is common to
repeat the training process with a different network topology or a different set of initial
weights. Cross-validation techniques for accuracy estimation (described in Chapter 8)
can be used to help decide when an acceptable network has been found. A number of
automated techniques have been proposed that search for a “good” network structure.
These typically use a hill-climbing approach that starts with an initial structure that is
selectively modified.
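One such hill-climbing search can be sketched as follows. Here `cv_accuracy` is a hypothetical callback standing in for the cross-validated accuracy of a network trained with the given hidden-layer size; the step sizes are arbitrary:

```python
def hill_climb_hidden_units(cv_accuracy, start=4, step=2, max_units=64):
    """Grow the hidden layer while cross-validated accuracy improves."""
    best_h, best_acc = start, cv_accuracy(start)
    h = start + step
    while h <= max_units:
        acc = cv_accuracy(h)
        if acc <= best_acc:          # stop at the first non-improving move
            break
        best_h, best_acc = h, acc
        h += step
    return best_h, best_acc

# Toy score that peaks at 8 hidden units, for illustration only.
toy = lambda h: 1.0 - abs(h - 8) / 10.0
print(hill_climb_hidden_units(toy))  # (8, 1.0)
```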
9.2.3 Backpropagation
“How does backpropagation work?” Backpropagation learns by iteratively processing a
data set of training tuples, comparing the network's prediction for each tuple with the
actual known target value. The target value may be the known class label of the training
tuple (for classification problems) or a continuous value (for numeric prediction). For
each training tuple, the weights are modified so as to minimize the mean-squared error
between the network's prediction and the actual target value. These modifications are
made in the “backwards” direction (i.e., from the output layer) through each hidden
layer down to the first hidden layer (hence the name backpropagation ). Although it is
not guaranteed, in general the weights will eventually converge, and the learning process
stops. The algorithm is summarized in Figure 9.3. The steps involved are expressed in
terms of inputs, outputs, and errors, and may seem awkward if this is your first look at
neural network learning.
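The per-tuple forward pass and backward weight update can be sketched for a one-hidden-layer network with sigmoid units. This is a minimal illustration, not the book's Figure 9.3 algorithm verbatim; bias units are omitted and the learning rate is arbitrary:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(x, target, W1, W2, lr=0.5):
    """One forward/backward pass for a single training tuple.
    W1: hidden_units x inputs; W2: output_units x hidden_units."""
    # Forward: input -> hidden -> output
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    o = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in W2]
    # Backward: output-layer error first, then propagate to the hidden layer
    err_o = [(t - oi) * oi * (1 - oi) for t, oi in zip(target, o)]
    err_h = [hi * (1 - hi) * sum(err_o[k] * W2[k][j] for k in range(len(W2)))
             for j, hi in enumerate(h)]
    # Weight updates: gradient descent on the squared error
    for k in range(len(W2)):
        for j in range(len(h)):
            W2[k][j] += lr * err_o[k] * h[j]
    for j in range(len(W1)):
        for i in range(len(x)):
            W1[j][i] += lr * err_h[j] * x[i]
    return o

# Toy run: repeatedly present one tuple with target 1.0 (illustrative only).
random.seed(0)
W1 = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)]
W2 = [[random.uniform(-0.5, 0.5) for _ in range(3)]]
for _ in range(2000):
    out = train_step([1.0, 0.0], [1.0], W1, W2)
print(out[0])  # close to 1.0 after training
```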