8.7 Conclusions
As our experiments presented in Table 8.7 show, rule sets induced by the
LEM2 algorithm outperform rule sets induced using feature selection (the LEM1
algorithm) in terms of the error rate. The Wilcoxon matched-pairs signed-rank test
indicates that the LEM2 algorithm is better at the 5% significance level (two-tailed
test).
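The comparison works on paired per-data-set error rates. A minimal pure-Python sketch of the Wilcoxon matched-pairs signed-rank statistic follows; the error-rate vectors are hypothetical illustrations, not the values from Table 8.7. For n = 14 pairs, the standard two-tailed 5% critical value of W is 21, so the null hypothesis of equal error rates is rejected when W <= 21.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W, n) for the Wilcoxon matched-pairs signed-rank test:
    W = min(W+, W-), where zero differences are dropped and tied
    absolute differences receive average ranks."""
    d = [a - b for a, b in zip(x, y) if a != b]
    ranked = sorted(d, key=abs)
    # assign average ranks to groups of tied absolute differences
    ranks = {}
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and abs(ranked[j]) == abs(ranked[i]):
            j += 1
        avg = (i + 1 + j) / 2  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg
        i = j
    w_plus = sum(ranks[k] for k in range(len(ranked)) if ranked[k] > 0)
    w_minus = sum(ranks[k] for k in range(len(ranked)) if ranked[k] < 0)
    return min(w_plus, w_minus), len(d)

# hypothetical error rates (%) for 14 data sets -- illustrative only
lem1_err = [30.1, 25.4, 19.8, 22.0, 35.5, 28.3, 17.2,
            40.0, 26.5, 21.1, 33.3, 29.9, 24.4, 18.6]
lem2_err = [27.0, 24.1, 18.0, 21.5, 31.2, 26.0, 16.9,
            36.5, 25.0, 20.3, 30.8, 27.7, 23.1, 17.5]

w, n = wilcoxon_signed_rank(lem1_err, lem2_err)
# reject equality of error rates at the 5% level when w <= 21
```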
Moreover, the LEM2 algorithm induces much simpler rule sets. As Table 8.8
shows, for all 14 data sets the rule sets induced by the LEM2 algorithm are smaller,
and the total number of conditions in these rule sets is smaller as well. Simpler rules
are easier to interpret.
Both algorithms, LEM1 and LEM2, are of polynomial time complexity, as
confirmed by Table 8.9. The Wilcoxon matched-pairs signed-rank test indicates that
there is no significant difference in run time between the two algorithms. LEM2 can
induce rule sets from data sets with tens of thousands of attributes, such as microarray
data sets; see, e.g., [8-10]. Therefore we may conclude that the LEM2 algorithm,
which searches the space of all attribute-value pairs, is better than LEM1, which is
based on feature selection.
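The core of LEM2 is a greedy covering loop over attribute-value pairs: it repeatedly picks the pair covering the most still-uncovered cases of the concept, narrows the goal, and emits a rule once its block lies inside the concept. A minimal sketch under the assumption of consistent (conflict-free) data follows; the toy attributes and cases are hypothetical, and refinements such as numerical attributes are omitted.

```python
def lem2(cases, concept):
    """Simplified LEM2: induce a local covering of `concept` (a set of
    case indices).  `cases` is a list of dicts mapping attribute -> value.
    Returns a list of rules, each a frozenset of (attribute, value) pairs."""
    def block(pairs):
        # cases matching every attribute-value pair in `pairs`
        return {i for i, c in enumerate(cases)
                if all(c[a] == v for a, v in pairs)}

    all_pairs = {(a, v) for c in cases for a, v in c.items()}
    uncovered = set(concept)
    rules = []
    while uncovered:
        rule, goal = set(), set(uncovered)
        while not rule or not block(rule) <= concept:
            # pairs relevant to the current goal, not yet in the rule
            cands = [t for t in all_pairs
                     if t not in rule and block({t}) & goal]
            # greedy choice: max overlap with goal, ties -> smaller block
            t = max(cands, key=lambda t: (len(block({t}) & goal),
                                          -len(block({t}))))
            rule.add(t)
            goal = block(rule) & goal
        # drop redundant attribute-value pairs from the rule
        for t in list(rule):
            if len(rule) > 1 and block(rule - {t}) <= concept:
                rule.discard(t)
        rules.append(frozenset(rule))
        uncovered -= block(rule)
    return rules

# toy data set -- attribute names and values are hypothetical
cases = [
    {"temp": "high",   "wind": "yes"},   # 0: flu
    {"temp": "high",   "wind": "no"},    # 1: flu
    {"temp": "normal", "wind": "yes"},   # 2: healthy
    {"temp": "normal", "wind": "no"},    # 3: healthy
]
flu = {0, 1}
rules = lem2(cases, flu)
```

Because every rule's block must be a subset of the concept, the resulting rule set is certain on consistent data, and the greedy pair selection is what lets LEM2 exploit single attribute-value pairs instead of whole attributes, unlike feature-selection-based LEM1.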
References
1. Bazan, J.G., Szczuka, M.S., Wojna, A., Wojnarski, M.: On the evolution of rough set exploration
system. In: Proceedings of the Rough Sets and Current Trends in Computing Conference, pp.
592-601 (2004)
2. Blum, A., Langley, P.: Selection of relevant features and examples in machine learning. Artif.
Intell. 97 , 245-271 (1997)
3. Booker, L.B., Goldberg, D.E., Holland, J.H.: Classifier systems and genetic algorithms. In:
Carbonell, J.G. (ed.) Machine Learning: Paradigms and Methods, pp. 235-282. MIT Press, Boston (1990)
4. Chan, C.C., Grzymala-Busse, J.W.: On the attribute redundancy and the learning programs
ID3, PRISM, and LEM2. Technical Report, Department of Computer Science, University of
Kansas (1991)
5. Chmielewski, M.R., Grzymala-Busse, J.W.: Global discretization of continuous attributes as
preprocessing for machine learning. Int. J. Approx. Reason. 15 (4), 319-331 (1996)
6. Dash, M., Liu, H.: Feature selection for classification. Intell. Data Anal. 1 , 131-156 (1997)
7. Everitt, B.: Cluster Analysis. Heinemann Educational Books, London (1980)
8. Fang, J., Grzymala-Busse, J.: Leukemia prediction from gene expression data—a rough set
approach. In: Proceedings of the Eighth International Conference on Artificial Intelligence and
Soft Computing, pp. 899-908 (2006)
9. Fang, J., Grzymala-Busse, J.: Mining of microRNA expression data—a rough set approach. In:
Proceedings of the First International Conference on Rough Sets and Knowledge Technology,
pp. 758-765 (2006)
10. Fang, J., Grzymala-Busse, J.: Predicting penetration across the blood-brain barrier—a rough
set approach. In: Proceedings of the IEEE International Conference on Granular Computing,
pp. 231-236 (2007)
11. Grzymala-Busse, J.W.: LERS—a system for learning from examples based on rough sets. In:
Slowinski, R. (ed.) Intelligent Decision Support: Handbook of Applications and Advances of
the Rough Set Theory, pp. 3-18. Kluwer Academic Publishers, Dordrecht, Boston (1992)
 