Table 2.2. Upper limits of both hormonal markers
         MoM    No MoM
AFP      5      115
hCG      5      85
The two datasets, Down_3109 and Down_4995, differ. As pointed out above, they were
collected at different times, and the acquisition of the data appears to have degraded
between them. Down_4995 has %TP=40% and %FP=7.71%, whilst Down_3109 has the normal
rates specified in the Introduction section.
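As a reading aid, the two rates quoted above can be computed from predicted and actual labels. The following is a minimal sketch that assumes the usual screening definitions (%TP as the share of affected cases that are detected, %FP as the share of unaffected cases that are flagged); the function name and the toy labels are hypothetical and are not taken from either dataset.

```python
def screening_rates(y_true, y_pred):
    """Return (%TP, %FP) for binary labels where 1 = affected, 0 = unaffected.

    Assumed definitions (not confirmed by the source text):
      %TP = 100 * TP / (TP + FN)   detected share of affected cases
      %FP = 100 * FP / (FP + TN)   flagged share of unaffected cases
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both classes must be present to compute the rates")
    return 100.0 * tp / (tp + fn), 100.0 * fp / (fp + tn)


# Toy labels (hypothetical): 5 affected cases, 10 unaffected cases.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
tp_rate, fp_rate = screening_rates(y_true, y_pred)
print(f"%TP={tp_rate:.2f}% %FP={fp_rate:.2f}%")
```

Reporting the two rates separately, rather than a single accuracy figure, keeps the minority (affected) class visible in the evaluation.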
The learning/test process was carried out by selecting a stratified 50% of the
Down_3109 dataset (keeping the same proportion of both classes) for learning and the
other 50% for testing. The best solutions were then tested a second time with the
Down_4195 dataset. Several generations of Fuzzy Systems were extracted by selecting
different variables. For the physical variables, the best solutions were obtained with
a set of 5 variables (age of the mother, weight of the mother, gestational age of the
fetus, and the measurements of the two hormones); for MoM, only 3 variables were
considered (age of the mother and the two MoM hormones). In this way, the fuzzy system
is obtained from a dataset that matches the results of the age/LR method and is also
tested with a noisy dataset, which guards against this situation in the future. Thus,
we can affirm that the extracted fuzzy system will be robust under noisy conditions.
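The stratified 50% split described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the seed parameter, and the toy labels are hypothetical, and the source does not say how the original split was implemented.

```python
import random
from collections import defaultdict


def stratified_half_split(labels, seed=0):
    """Split indices into learning and test halves, class by class.

    Each class is shuffled and divided separately, so both halves keep
    (approximately) the class proportions of the full dataset. When a
    class has an odd number of cases, the extra case goes to the test half.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    learn_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        half = len(indices) // 2
        learn_idx.extend(indices[:half])
        test_idx.extend(indices[half:])
    return sorted(learn_idx), sorted(test_idx)


# Toy labels (hypothetical): 10 minor-class cases among 100 patterns.
labels = [1] * 10 + [0] * 90
learn_idx, test_idx = stratified_half_split(labels, seed=1)
print(len(learn_idx), sum(labels[i] for i in learn_idx))
```

Splitting per class matters here because a purely random half of a strongly imbalanced dataset could end up with very few minor-class patterns in one of the halves.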
The tobacco and alcohol consumption variables did not produce good results, and
neither did the diabetes variable. The first two may introduce noise because their
values depend on how truthfully the women report whether their alcohol consumption is
null, 1-5 times a week or 1-5 times a day, and whether they smoke 5 or more cigarettes
a day or do not smoke at all. We also rejected the diabetes variable because of noise
problems.
Several classification methods were tested with this dataset in order to rule out the
possibility of finding a good solution with them: Neural Networks (BayesNN,
Backpropagation, etc.), classical methods of Fuzzy Rule Extraction and other methods,
such as decision trees or SVM [9,10,15,16]. The results were always negative because
of the way the minor-class patterns were treated: either the methods specialized
neurons/rules/etc. on the cases belonging to the minor class, or they ignored the
minor-class patterns altogether (always trying to match the major-class patterns
without taking the minor-class patterns into account).
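The second failure mode above, ignoring the minor class, is easy to miss if only accuracy is inspected. The sketch below uses hypothetical counts (10 minor-class cases in 1000 patterns, not figures from the datasets) and a hypothetical helper to show that a classifier which always answers with the major class still scores a high accuracy while detecting none of the minor-class patterns.

```python
def accuracy_and_tp_rate(y_true, y_pred):
    """Return (accuracy %, %TP) for binary labels where 1 = minor class."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    positives = sum(1 for t in y_true if t == 1)
    detected = sum(1 for t, p in pairs if t == 1 and p == 1)
    if positives == 0:
        raise ValueError("no minor-class patterns present")
    return 100.0 * correct / len(pairs), 100.0 * detected / positives


# Hypothetical imbalanced set: 10 minor-class cases among 1000 patterns.
y_true = [1] * 10 + [0] * 990
# A degenerate classifier that always predicts the major class.
always_major = [0] * len(y_true)

accuracy, tp_rate = accuracy_and_tp_rate(y_true, always_major)
print(f"accuracy={accuracy:.1f}% %TP={tp_rate:.1f}%")
```

This is why the results in this section are reported as %TP and %FP rather than as a single accuracy value.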
Table 2.3 shows the best results for the Down's syndrome problem using the FLAGID
method, ordered by %TP (True Positives). The first two columns refer to a test done
with the Down_4815 dataset, which is not included in the Down_3071 dataset used for
learning. For every % of correct positives found, the table shows the % of false
positives, the type of variable set with better results (5 or 3 variables), whether
discarding membership functions was needed, the number of rules found, and the %FP
(False Positives) and %TP from the first dataset. The last two columns show the
accuracy for the Down_3071 dataset after testing on all training and test patterns.
Table 2.3 shows that the results are very similar to those obtained by the age/LR
method (60%-70% TP and 10% FP). Screening methods, as commented at the