consider initially a limited set of candidate features, and consider new ones
incrementally until an optimal subset is selected. The proposed algorithm
proceeds in an iterative manner: it receives at each time step either a new
example or a new feature or both, and adjusts the current set of selected
features. When a new feature is provided, the algorithm makes a decision
regarding whether to substitute an existing feature with the new one, or
maintain the current set of features, according to a value computed for
each feature relative to the current feature set. The value of each feature is
evaluated [73] by the amount of class information it contributes beyond that
already captured by the selected set. The algorithm also keeps a fixed-size
set of the most recent examples, used to evaluate newly provided features. In
this way, the evaluation time of a feature's value, which depends only on the
number of examples and the number of features in the selected set, remains
constant throughout learning. Given a feature f and a set of selected features
S, the desired merit value MV(f; S) [74] should express the additional class
information gained by adding f to S. This can be measured using mutual
information as:

MV(f; S) = I(f, S; C) - I(S; C),    (8.6)

where I stands for mutual information and C denotes the class variable.
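As a rough illustration of Eq. (8.6), the merit value can be estimated from empirical frequencies when features and classes are discrete. This is a minimal sketch, not the authors' implementation; the helper names (entropy, mutual_information, merit_value) and the column-based data layout are our own assumptions.

```python
from collections import Counter
from math import log2

def entropy(labels):
    # Empirical Shannon entropy H(C) of a label sequence, in bits.
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def mutual_information(features, classes):
    # I(X; C) = H(C) - H(C | X), where each element of `features` is a
    # hashable value (e.g. a tuple of joint feature values) for one example.
    h_c = entropy(classes)
    n = len(classes)
    groups = {}
    for x, c in zip(features, classes):
        groups.setdefault(x, []).append(c)
    h_c_given_x = sum(len(cs) / n * entropy(cs) for cs in groups.values())
    return h_c - h_c_given_x

def merit_value(f_col, s_cols, classes):
    # MV(f; S) = I(f, S; C) - I(S; C): the extra class information
    # gained by adding feature column f_col to the selected columns s_cols.
    n = len(classes)
    joint_s = [tuple(col[i] for col in s_cols) for i in range(n)]
    joint_with_f = [joint_s[i] + (f_col[i],) for i in range(n)]
    return mutual_information(joint_with_f, classes) - mutual_information(joint_s, classes)
```

A perfectly predictive new feature yields MV equal to the residual class entropy, while a feature independent of the class adds nothing; this is what lets the algorithm decide between substituting and keeping the current set.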
8.2.10. Multiclass MTS for simultaneous feature selection and classification
Here, the important features are identified using orthogonal arrays
and the signal-to-noise ratio, and are then used to construct a reduced-model
measurement scale. MTS combines two concepts: Mahalanobis distance and
Taguchi's robust engineering [75]. Mahalanobis distance is used to construct
a multidimensional measurement scale and to define a reference point of the
scale from a set of observations of a reference group. Taguchi's robust
engineering is applied to determine the important features and then optimize the system.
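The Mahalanobis side of the method can be sketched as follows: the reference group supplies a mean vector and an inverse covariance matrix, which together define the measurement scale. This is a minimal sketch under our own naming, not the book's procedure; note that MTS typically reports the squared distance scaled by the number of features, whereas the plain squared distance is shown here.

```python
import numpy as np

def mahalanobis_scale(reference_group):
    """Build a distance function from a reference ("normal") group.

    reference_group: array of shape (n_observations, n_features).
    The group's mean is the reference point of the scale; its inverse
    covariance shapes the multidimensional measurement scale.
    """
    mu = reference_group.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(reference_group, rowvar=False))

    def distance(x):
        # Squared Mahalanobis distance of observation x from the reference point.
        d = np.asarray(x) - mu
        return float(d @ cov_inv @ d)

    return distance
```

Observations close to the reference group score near zero, and the distance grows as an observation deviates along directions the reference group rarely varies in; feature subsets can then be compared by how well this scale separates abnormal observations.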
The goal of multiclass classification problems [76, 77] is to find a mapping or
function, C_i = f(X), that can predict the class label C_i associated with a
given example vector X. Thus, it is expected that the mapping or function
can accurately separate the data classes. MTS differs from classical
multivariate methods in the following ways [78, 79]. First, the methods used in
MTS are data analytic instead of probability-based inference. This means
that MTS does not require any assumptions on the distribution of input