Information Technology Reference
In-Depth Information
Table 1. Confusion matrix and corresponding performance indexes obtained by classifying
the dataset with the model proposed by the Genetic Algorithm (projection of the
four-feature subset into one dimension by LDA and classification by k-NN with k = 9).
Results have been obtained by performing a leave-one-subject-out cross-validation, and
confidence intervals have been computed at a 95% confidence level. Positive samples
correspond to diseased subjects, while negative samples represent healthy volunteers.
CONFUSION MATRIX                TRUE LABELS
                                Positive    Negative
ESTIMATED LABELS    Positive    56          0
                    Negative    4           44
                    Total       60          44

Accuracy         96.15% ± 5.41%
Sensitivity      93.33% ± 9.3%
Specificity      100% ± 0%
Precision POS    100% ± 0%
Precision NEG    91.66% ± 11.57%

of 92.6%, sensitivity of 95.3% and specificity of 90.5% (on 58 control subjects
and 43 lung cancer subjects) [1]. An interesting observation is that three of the
four best features are extracted from the same sensor, meaning that two sensors
are enough to diagnose lung cancer with the excellent performance indexes shown
in Table 1. When only the features extracted from a single sensor were kept (the
sensor yielding three of the four best features), results were still very
satisfactory, with an average accuracy of 90.38%, sensitivity of 83.33% and
specificity of 100%. All the
four best features were derived from the transient response: its integral,
its derivative, the difference between the baseline and the steady state, and the
ratio between the baseline and the steady state. This confirms the theory that
the transient response contains useful information related to the dynamics of
the phenomenon. The best component projected by means of LDA is shown in
Figure 5, where the discrimination between the two classes (lung cancer patient
and healthy subject) is evident. Finally, the best classifier turned out to be
k-NN with k = 9. However, a Student's t-test between all pairs of considered
models revealed no significant differences, suggesting that all the computational
intelligence methods we applied provided satisfactory results.
It should be noted that two of the four misclassified samples belong to
a subject whose diagnosis by PET also failed. This suggests that this patient
may present some unusual physiological parameters.
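As a quick check, the point estimates of the performance indexes in Table 1 can be recomputed from the confusion matrix counts. The following is a minimal sketch: the variable names are ours, and the confidence intervals are not reproduced because the text does not state how they were computed. Note that 44/48 rounds to 91.67%, whereas Table 1 reports 91.66%, presumably by truncation.

```python
# Counts taken from Table 1 (rows: estimated labels, columns: true labels).
tp, fp = 56, 0   # estimated positive: true positives, false positives
fn, tn = 4, 44   # estimated negative: false negatives, true negatives

total = tp + fp + fn + tn

# Point estimates only; the source does not say how the intervals were derived.
metrics = {
    "accuracy": (tp + tn) / total,
    "sensitivity": tp / (tp + fn),     # fraction of diseased samples detected
    "specificity": tn / (tn + fp),     # fraction of healthy samples recognised
    "precision_pos": tp / (tp + fp),   # reliability of a positive estimate
    "precision_neg": tn / (tn + fn),   # reliability of a negative estimate
}

for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

The same counts also show that all four errors are false negatives, which is why specificity and positive precision are both 100%.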
We also investigated the possibility of performing early diagnosis by training
the model to distinguish between stage I lung cancer patients and healthy
subjects. The approach was the same as that used for the classification
of lung cancer (all stages) versus healthy subjects: the genetic algorithm
searched for the best combination of feature subset, feature projection and
classifier, and returned a new subset of 7 features together with the same
projection and classification algorithms (LDA and k-NN with k = 9) as in the
lung cancer diagnosis task. Results are shown in Table 2. Although these results
are relatively satisfactory, their confidence intervals are fairly wide and,
in any case, wider than those achieved in the lung cancer vs healthy classification.
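The leave-one-subject-out cross-validation mentioned in the caption of Table 1 holds out all samples of one subject in each fold, so that no subject contributes to both training and testing. The sketch below illustrates the idea with a one-dimensional k-NN vote (k = 9). It is an illustration under stated assumptions, not the authors' code: the data, subject identifiers and function names are hypothetical, and the values in `x` stand in for features already projected to one dimension, whereas a complete pipeline would refit the LDA projection on each training fold.

```python
from collections import Counter

def leave_one_subject_out(subject_ids):
    """Yield (train_idx, test_idx), holding out every sample of one subject per fold."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield train_idx, test_idx

def knn_predict_1d(train_x, train_y, x, k=9):
    """Return the majority label among the k training values closest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: 8 subjects with 2 samples each.
# x stands in for the one-dimensional projection; y is 1 for diseased, 0 for healthy.
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4",
               "s5", "s5", "s6", "s6", "s7", "s7", "s8", "s8"]
x = [-1.2, -0.9, -1.1, -0.8, -1.0, -0.7, -1.3, -0.6,
     0.9, 1.2, 0.8, 1.1, 1.0, 0.7, 1.3, 0.6]
y = [0] * 8 + [1] * 8

correct = 0
for train_idx, test_idx in leave_one_subject_out(subject_ids):
    train_x = [x[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    for i in test_idx:
        # A correct prediction adds 1, a wrong one adds 0.
        correct += knn_predict_1d(train_x, train_y, x[i]) == y[i]

accuracy = correct / len(y)
print(f"leave-one-subject-out accuracy: {accuracy:.2%}")
```

Splitting by subject rather than by sample matters when a subject contributes several samples, since samples from the same person are likely to be correlated and would otherwise inflate the estimated accuracy.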
 