Chapter 18
Pattern Classification Techniques for Lung
Cancer Diagnosis by an Electronic Nose
Rossella Blatt, Andrea Bonarini, and Matteo Matteucci
Politecnico di Milano, Dipartimento di Elettronica e Informazione,
Via Ponzio 34/5 20133 Milan, Italy
rblatt@iit.edu, {Bonarini,Matteucci}@elet.polimi.it
Abstract. Computational intelligence techniques can be used to analyze
the olfactory signal acquired by an electronic nose and to extract
information useful for diagnosing a variety of human diseases. Our
research proposes the use of an electronic nose to diagnose lung cancer.
An electronic nose acquires and recognizes the volatile organic
compounds (VOCs) present in the analyzed substance. It is composed
of an array of electronic chemical sensors and a pattern classification
module based on computational intelligence techniques. Its operation
consists of three main stages: acquisition, preprocessing and pattern
analysis. In the lung cancer detection experiment, we analyzed 104
breath samples from 52 subjects: 22
healthy subjects and 30 patients with primary lung cancer at different
stages. To find the classification model that best discriminates
between the two classes (healthy subjects and lung cancer patients) and
to reduce the dimensionality of the problem, we implemented a genetic
algorithm (GA) that searches for the best combination of feature
selection, feature projection and classification algorithms. In
particular, for
feature projection, we considered Principal Component Analysis (PCA),
Fisher Linear Discriminant Analysis (LDA) and Non Parametric Linear
Discriminant Analysis (NPLDA); for classification, we implemented
several supervised pattern classification algorithms: different
k-Nearest Neighbors (k-NN) approaches (classic, modified and fuzzy
k-NN), linear and quadratic discriminant function classifiers, and a
feed-forward Artificial Neural Network (ANN). The best solution found
by the genetic algorithm projects a subset of the features onto a
single component using Fisher Linear Discriminant Analysis and
classifies the result with the k-Nearest Neighbors
method. The results, all obtained with cross-validation, were
excellent: an average accuracy of 96.2%, an average sensitivity of
93.3% and an average specificity of 100%, with very narrow confidence
intervals. We also investigated the possibility of early diagnosis by
building a model that distinguishes samples from subjects with stage I
primary lung cancer from samples from healthy subjects. The results of
this analysis were also very satisfactory: an average accuracy of
92.85%, an average sensitivity of 75.5% and an average specificity of
97.72%. The achieved results demonstrate that the
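
As a reading aid, the accuracy, sensitivity and specificity figures quoted in the abstract can be related to a binary confusion matrix. The Python sketch below is illustrative only. The function name diagnostic_metrics and the counts are assumptions, not taken from the chapter: 28 of 30 lung cancer subjects and 22 of 22 healthy subjects correctly classified is simply one subject-level tally that reproduces the reported 96.2%, 93.3% and 100%. The chapter itself reports cross-validated averages, so its actual per-fold counts may differ.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) for a binary diagnostic test.

    tp: diseased subjects classified as diseased (true positives)
    fn: diseased subjects classified as healthy  (false negatives)
    tn: healthy subjects classified as healthy   (true negatives)
    fp: healthy subjects classified as diseased  (false positives)
    """
    sensitivity = tp / (tp + fn)  # fraction of diseased subjects detected
    specificity = tn / (tn + fp)  # fraction of healthy subjects correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all subjects correct
    return accuracy, sensitivity, specificity


# Hypothetical counts (an assumption, NOT data from the chapter):
# 30 lung cancer subjects with 2 missed, 22 healthy subjects with none misclassified.
acc, sens, spec = diagnostic_metrics(tp=28, fn=2, tn=22, fp=0)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} specificity={spec:.1%}")
```

Under these assumed counts, a specificity of 100% means no healthy subject was flagged as ill, while the lower sensitivity reflects the two missed lung cancer cases.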