The main issue in defining a classifier system is the choice of the learning method. In general, one must find a decision rule that optimally partitions the measurement space into K regions, one for each class C_k; the boundaries between regions are called decision boundaries. For the task of lung cancer diagnosis by electronic nose, we considered three families of classifiers: several versions of the k-Nearest Neighbors (k-NN) classifier (classic, modified and fuzzy k-NN), Linear and Quadratic Discriminant function classifiers (LD and QD, respectively) and an Artificial Neural Network (ANN). In the following, a brief overview of these algorithms and of other computational intelligence techniques used in olfactory signal analysis is provided.
k-Nearest Neighbors method. The basic idea behind this simple and powerful algorithm is to assign the input sample to the class to which the majority of its k closest samples in the training set belong. The method can perform nonlinear classification even from a small number of samples. The algorithm is based on a distance measure (e.g., the Euclidean distance) between feature vectors, and it has been demonstrated [12] that k-NN is formally a nonparametric approximation of the Maximum A Posteriori (MAP) criterion. The asymptotic performance of the algorithm is almost optimal: with an infinite number of training samples and k = 1, the minimum error is never higher than twice the Bayes error, which is the theoretical lower bound [10]. One of the most critical aspects of this method is the choice of the parameter k when only a limited number of samples is available: if k is too large, the problem is oversimplified and local information loses its relevance; on the other hand, too small a k yields a density estimate that is overly sensitive to outliers. A small variation of the classic k-NN is the modified k-NN, in which k still denotes the number of closest neighbors to look for (as in the classic k-NN), but these must all belong to the same class; the neighborhood is thus enlarged dynamically according to the noise in the dataset. Both rules are sketched below.
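As a concrete illustration, the following is a minimal Python sketch of the classic and modified k-NN rules described above. The NumPy-based implementation and the function names (knn_classify, modified_knn_classify) are ours, chosen for illustration; they do not reproduce the original system's code.

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x, k=3):
    """Classic k-NN: majority vote among the k nearest training samples."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distances to x
    nearest = np.argsort(dists)[:k]              # indices of the k closest samples
    return Counter(y_train[nearest]).most_common(1)[0][0]

def modified_knn_classify(X_train, y_train, x, k=3):
    """Modified k-NN: scan training samples in order of increasing distance
    and return the first class to accumulate k neighbors, i.e. the class
    whose k nearest members are collectively closest to x. Assumes k does
    not exceed the size of the largest class."""
    dists = np.linalg.norm(X_train - x, axis=1)
    counts = Counter()
    for idx in np.argsort(dists):
        counts[y_train[idx]] += 1
        if counts[y_train[idx]] == k:
            return y_train[idx]

# Toy usage: two well-separated classes in a 2-D feature space.
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_classify(X, y, np.array([0.2, 0.1]), k=3))           # -> 0
print(modified_knn_classify(X, y, np.array([0.2, 0.1]), k=2))  # -> 0
```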
Discriminant Functions classifiers (DF). Classification based on discriminant functions is a geometric approach in which the feature space is divided into c decision regions, each corresponding to a particular class. The classifier is represented as a family of discriminant functions g_i(x), each producing a single output, chosen so as to minimize a given cost function; an input x is assigned to the class whose discriminant function yields the largest value. In DF analysis the data are assumed to follow a multivariate Gaussian distribution. In our work, we considered two types of discriminant functions: the linear (LD) and the quadratic one (QD). A classifier based on a linear discriminant function divides the feature space by hyperplanes and is therefore optimal when the problem is linearly separable; the technique can, however, also perform well when the problem is not linearly separable. We implemented the Minimum Distance to Means (MDM) approach, in which the representative of each class is computed as the mean of the samples belonging to that class (see the sketch below). This approach is very simple and generalizes well; its drawback is that it compresses all the information into a single representative value. If the problem is not linearly separable, a quadratic discriminant function may be more suitable, as was also verified in this work.
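For concreteness, here is a minimal Python sketch of the MDM rule under the assumptions above; the function names (mdm_fit, mdm_classify) are illustrative and not taken from the original work.

```python
import numpy as np

def mdm_fit(X_train, y_train):
    """Compute one representative per class: the mean of its training samples."""
    return {c: X_train[y_train == c].mean(axis=0) for c in np.unique(y_train)}

def mdm_classify(means, x):
    """Assign x to the class with the nearest mean (Euclidean distance).
    Minimizing ||x - m_i||^2 is equivalent to maximizing the linear
    discriminant g_i(x) = m_i . x - ||m_i||^2 / 2, so the decision
    boundaries are hyperplanes bisecting pairs of class means."""
    return min(means, key=lambda c: np.linalg.norm(x - means[c]))

# Toy usage with the same two-class data as in the k-NN sketch.
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
means = mdm_fit(X, y)
print(mdm_classify(means, np.array([0.2, 0.1])))  # -> 0
```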