Fig. 5. The results of dimensionality reduction through linear discriminant analysis
(LDA) into one component. As is evident, the separability between the two classes
(healthy subjects and lung cancer patients) is very satisfactory. The samples in the red
circle are the four misclassified lung cancer patients, erroneously assigned to the healthy class.
Table 2. Confusion matrix and corresponding performance indexes obtained by classifying
the dataset with the model proposed by the Genetic Algorithm (projection of
the seven-feature subset into one dimension by LDA and classification by k-NN with
k = 9). Results were obtained by performing a leave-one-subject-out cross-validation,
and confidence intervals were computed at a 95% confidence level. Positive samples
correspond to stage I lung cancer patients, while negative samples represent healthy
subjects. Performance is lower than that achieved in the classification of lung cancer
patients (all stages) versus healthy subjects, but this is probably due to the small available
dataset for the positive class and the imbalance between the two classes.
CONFUSION MATRIX
                          TRUE LABELS
                          Positive   Negative
ESTIMATED    Positive         9          1
LABELS       Negative         3         43
             Total           12         44

PERFORMANCE INDEXES
Accuracy         92.86% ± 8.51%
Sensitivity      75% ± 34.31%
Specificity      97.73% ± 4.57%
Precision POS    90% ± 40.15%
Precision NEG    93.48% ± 13.39%
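The point estimates in Table 2 follow directly from the four counts of the confusion matrix. The following is a minimal sketch of that arithmetic; the function name performance_indexes and the dictionary keys are illustrative choices, not part of the original work. The confidence intervals are not reproduced, because the text does not state which method was used to compute them.

```python
# Counts taken from Table 2 (positive = stage I lung cancer, negative = healthy).
TP, FP = 9, 1    # estimated positive: truly positive / truly negative
FN, TN = 3, 43   # estimated negative: truly positive / truly negative


def performance_indexes(tp, fp, fn, tn):
    """Return point estimates (as fractions) of the indexes listed in Table 2.

    Hypothetical helper written for illustration only.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,       # correctly classified / all samples
        "sensitivity": tp / (tp + fn),       # detected positives / true positives
        "specificity": tn / (tn + fp),       # detected negatives / true negatives
        "precision_pos": tp / (tp + fp),     # true positives / estimated positives
        "precision_neg": tn / (tn + fn),     # true negatives / estimated negatives
    }


indexes = performance_indexes(TP, FP, FN, TN)
for name, value in indexes.items():
    print(f"{name}: {100 * value:.2f}%")
```

With only 12 positive samples, each misclassified patient shifts the sensitivity by more than 8 percentage points, which is consistent with the wide intervals reported for the positive class.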
This could be due both to the small available dataset (only 12 subjects affected
by stage I lung cancer) and to the imbalance of the dataset (12 stage I lung
cancer patients versus 44 healthy subjects). Regardless of this analysis, the achieved results
suggest that the discrimination between stage I lung cancer patients and healthy