changes in the new chromosomes (mutation) to avoid local minima and to allow
a more extensive exploration of the solution space. The goodness of a solution
is evaluated through the so-called fitness function, that is, the objective
function to maximize.
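As a minimal sketch of the mutation step on a binary chromosome, the following
Python fragment flips each bit with a small probability and scores the result.
The names fitness and mutate, the mutation rate and the bit-counting objective
are illustrative assumptions, not taken from the work described here.

```python
import random

def fitness(chromosome):
    # Toy objective (assumption): the number of 1-bits, to be maximized.
    return sum(chromosome)

def mutate(chromosome, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]

rng = random.Random(0)
parent = [0, 1, 0, 0, 1, 0, 1, 0]
child = mutate(parent, rate=0.1, rng=rng)
print(child, fitness(child))
```

A low rate keeps most of the parent intact while still letting the search
leave the neighbourhood of a locally optimal solution.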
This process leads to the evolution of populations of individuals that are
better suited to their environment than the individuals from which they were
created, mimicking natural adaptation. GAs, if properly coded, can be used to
solve a wide range of problems, such as optimization (e.g., circuit layout, job
shop scheduling), prediction (e.g., weather forecasting, protein folding),
classification (e.g., fraud detection, quality assessment), economics (e.g.,
bidding strategies, market evaluation), ecology (e.g., biological arms races,
host-parasite coevolution) and automatic programming. In the electronic nose
field they have been widely used, in particular to perform feature selection,
to find the best classifier parameters and to identify the best architecture
and topology for the specific algorithm.
In our work, we implemented a genetic algorithm to find the best combination
of feature selection, feature projection, classifier and its parameters. As
previously mentioned, the quality of a classifier depends on the feature matrix
on which it is applied, and different classifiers may have different optimal
feature matrices, that is, different optimal feature subsets and projections.
The implemented genetic algorithm was binary coded, and each chromosome included
information about the specific features to keep, the projection algorithm to
be applied (choosing among LDA, NPLDA and PCA), the classifier to adopt
(choosing among k-NN, modified k-NN, fuzzy k-NN, linear discriminant function,
quadratic discriminant function and a feed-forward artificial neural network)
and the corresponding parameters (e.g., k value, number of hidden layers,
etc.). The population was composed of 100 chromosomes. At each generation,
the best solutions were chosen according to the roulette wheel selection rule and
the reproduction operation was performed by means of a scattered crossover and
a Gaussian mutation. Elitism was adopted to help avoid local minima and to
ensure that the fitness of the best chromosome is monotonic across
generations. The fitness function was proportional to the product of the mean
squared error and the variance obtained by performing the classification with
the feature subset, the projection and the classifier encoded in the
chromosome under consideration.
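The generation step described above can be sketched in Python as follows. This
is an illustration under stated assumptions, not the implementation used in
this work: the chromosome layout in decode, the elite count, the mutation rate
and toy_fitness are all hypothetical, and a simple bit-flip mutation stands in
for the Gaussian mutation mentioned in the text. The toy fitness merely counts
1-bits in place of the classification-based fitness.

```python
import random

PROJECTIONS = ["LDA", "NPLDA", "PCA"]

def decode(chrom, n_features):
    # Assumed layout: one bit per feature, then two bits for the projection.
    mask = chrom[:n_features]
    code = chrom[n_features] * 2 + chrom[n_features + 1]
    return mask, PROJECTIONS[code % len(PROJECTIONS)]

def roulette_select(population, scores, rng):
    # Pick an individual with probability proportional to its score
    # (scores are assumed to be non-negative).
    total = sum(scores)
    if total <= 0:
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, score in zip(population, scores):
        running += score
        if running >= pick:
            return individual
    return population[-1]

def scattered_crossover(a, b, rng):
    # Take each gene from parent a or parent b according to a random mask.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(chrom, rate, rng):
    # Bit-flip mutation (a stand-in for the Gaussian mutation in the text).
    return [1 - g if rng.random() < rate else g for g in chrom]

def next_generation(population, fitness, rng, n_elite=2, rate=0.01):
    scores = [fitness(c) for c in population]
    ranked = sorted(zip(scores, population), key=lambda p: p[0], reverse=True)
    # Elitism: copy the best chromosomes unchanged into the new population.
    new_pop = [list(c) for _, c in ranked[:n_elite]]
    while len(new_pop) < len(population):
        a = roulette_select(population, scores, rng)
        b = roulette_select(population, scores, rng)
        new_pop.append(mutate(scattered_crossover(a, b, rng), rate, rng))
    return new_pop

def toy_fitness(chrom):
    # Placeholder objective (assumption): the number of 1-bits.
    return sum(chrom)

rng = random.Random(42)
population = [[rng.randint(0, 1) for _ in range(12)] for _ in range(100)]
best_per_generation = []
for _ in range(20):
    best_per_generation.append(max(toy_fitness(c) for c in population))
    population = next_generation(population, toy_fitness, rng)
print(best_per_generation)
```

Because the elite chromosomes are carried over unchanged, the best fitness
recorded in best_per_generation cannot decrease from one generation to the
next, which is the monotonic behaviour that elitism is meant to guarantee.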
The best model provided by the GA was a subset of four features, projected
onto a single component using Linear Discriminant Analysis (LDA) and finally
classified by means of the k Nearest Neighbours (k-NN) method, with k = 9.
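Once the features have been projected onto a single component, the k-NN rule
reduces to a majority vote among the nine training samples whose projected
values lie closest to the query. The sketch below illustrates only this last
step; the function name knn_predict, the class labels "A" and "B" and the
projected values are synthetic assumptions, not data from this study.

```python
from collections import Counter

def knn_predict(train_values, train_labels, query, k=9):
    # Majority vote among the k training points closest to `query` on a 1-D axis.
    order = sorted(range(len(train_values)),
                   key=lambda i: abs(train_values[i] - query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Synthetic projected scores (illustrative only).
values = [-2.1, -1.8, -1.5, -1.2, -1.0, -0.9, -0.7, -0.4, -0.2, 0.1,
          0.3, 0.6, 0.8, 1.0, 1.1, 1.4, 1.6, 1.9, 2.2, 2.5]
labels = ["A"] * 10 + ["B"] * 10

print(knn_predict(values, labels, -1.3))
print(knn_predict(values, labels, 1.7))
```

An odd k such as 9 avoids tied votes in a two-class problem.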
9 Validation
The previous sections have briefly reviewed a number of pattern analysis
techniques used in olfactory signal analysis and, more specifically, in lung
cancer diagnosis with an electronic nose. This section addresses the