8.7 Conclusions
As our experiments presented in Table 8.7 show, rule sets induced by the
LEM2 algorithm outperform rule sets induced using feature selection (the LEM1
algorithm) in terms of the error rate. The Wilcoxon matched-pairs signed-rank test
indicates that the LEM2 algorithm is better at the 5% significance level (two-tailed
test).
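The comparison works on paired per-data-set error rates. A minimal pure-Python sketch of the Wilcoxon matched-pairs signed-rank statistic follows; the error-rate vectors are hypothetical illustrations, not the values from Table 8.7. For n = 14 pairs, the standard two-tailed 5% critical value of W is 21, so the null hypothesis of equal error rates is rejected when W <= 21.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W, n) for the Wilcoxon matched-pairs signed-rank test:
    W = min(W+, W-), where zero differences are dropped and tied
    absolute differences receive average ranks."""
    d = [a - b for a, b in zip(x, y) if a != b]
    ranked = sorted(d, key=abs)
    # assign average ranks to groups of tied absolute differences
    ranks = {}
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and abs(ranked[j]) == abs(ranked[i]):
            j += 1
        avg = (i + 1 + j) / 2  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg
        i = j
    w_plus = sum(ranks[k] for k in range(len(ranked)) if ranked[k] > 0)
    w_minus = sum(ranks[k] for k in range(len(ranked)) if ranked[k] < 0)
    return min(w_plus, w_minus), len(d)

# hypothetical error rates (%) for 14 data sets -- illustrative only
lem1_err = [30.1, 25.4, 19.8, 22.0, 35.5, 28.3, 17.2,
            40.0, 26.5, 21.1, 33.3, 29.9, 24.4, 18.6]
lem2_err = [27.0, 24.1, 18.0, 21.5, 31.2, 26.0, 16.9,
            36.5, 25.0, 20.3, 30.8, 27.7, 23.1, 17.5]

w, n = wilcoxon_signed_rank(lem1_err, lem2_err)
# reject equality of error rates at the 5% level when w <= 21
```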
Moreover, the LEM2 algorithm induces much simpler rule sets. As Table 8.8
shows, for all 14 data sets the rule sets induced by the LEM2 algorithm are smaller,
and the total number of conditions in these rule sets is smaller as well. Simpler rules
are easier to interpret.
Both algorithms, LEM1 and LEM2, are of polynomial time complexity, as
confirmed by Table 8.9. The Wilcoxon matched-pairs signed-rank test indicates that
there is no significant difference in run time between the two algorithms. LEM2 can
induce rule sets from data sets with tens of thousands of attributes, such as microarray
data sets; see, e.g., [8-10]. Therefore we may conclude that the LEM2 algorithm,
which searches the space of all attribute-value pairs, is better than LEM1, which is
based on feature selection.
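The core of LEM2 is a greedy covering loop over attribute-value pairs: it repeatedly picks the pair covering the most still-uncovered cases of the concept, narrows the goal, and emits a rule once its block lies inside the concept. A minimal sketch under the assumption of consistent (conflict-free) data follows; the toy attributes and cases are hypothetical, and refinements such as numerical attributes are omitted.

```python
def lem2(cases, concept):
    """Simplified LEM2: induce a local covering of `concept` (a set of
    case indices).  `cases` is a list of dicts mapping attribute -> value.
    Returns a list of rules, each a frozenset of (attribute, value) pairs."""
    def block(pairs):
        # cases matching every attribute-value pair in `pairs`
        return {i for i, c in enumerate(cases)
                if all(c[a] == v for a, v in pairs)}

    all_pairs = {(a, v) for c in cases for a, v in c.items()}
    uncovered = set(concept)
    rules = []
    while uncovered:
        rule, goal = set(), set(uncovered)
        while not rule or not block(rule) <= concept:
            # pairs relevant to the current goal, not yet in the rule
            cands = [t for t in all_pairs
                     if t not in rule and block({t}) & goal]
            # greedy choice: max overlap with goal, ties -> smaller block
            t = max(cands, key=lambda t: (len(block({t}) & goal),
                                          -len(block({t}))))
            rule.add(t)
            goal = block(rule) & goal
        # drop redundant attribute-value pairs from the rule
        for t in list(rule):
            if len(rule) > 1 and block(rule - {t}) <= concept:
                rule.discard(t)
        rules.append(frozenset(rule))
        uncovered -= block(rule)
    return rules

# toy data set -- attribute names and values are hypothetical
cases = [
    {"temp": "high",   "wind": "yes"},   # 0: flu
    {"temp": "high",   "wind": "no"},    # 1: flu
    {"temp": "normal", "wind": "yes"},   # 2: healthy
    {"temp": "normal", "wind": "no"},    # 3: healthy
]
flu = {0, 1}
rules = lem2(cases, flu)
```

Because every rule's block must be a subset of the concept, the resulting rule set is certain on consistent data, and the greedy pair selection is what lets LEM2 exploit single attribute-value pairs instead of whole attributes, unlike feature-selection-based LEM1.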
References
1. Bazan, J.G., Szczuka, M.S., Wojna, A., Wojnarski, M.: On the evolution of rough set exploration
system. In: Proceedings of the Rough Sets and Current Trends in Computing Conference, pp.
592-601 (2004)
2. Blum, A., Langley, P.: Selection of relevant features and examples in machine learning. Artif.
Intell. 97 , 245-271 (1997)
3. Booker, L.B., Goldberg, D.E., Holland, J.H.: Classifier systems and genetic algorithms. In:
Carbonell, J.G. (ed.) Machine Learning: Paradigms and Methods, pp. 235-282. MIT Press, Boston (1990)
4. Chan, C.C., Grzymala-Busse, J.W.: On the attribute redundancy and the learning programs
ID3, PRISM, and LEM2. Technical Report, Department of Computer Science, University of
Kansas (1991)
5. Chmielewski, M.R., Grzymala-Busse, J.W.: Global discretization of continuous attributes as
preprocessing for machine learning. Int. J. Approx. Reason. 15 (4), 319-331 (1996)
6. Dash, M., Liu, H.: Feature selection for classification. Intell. Data Anal. 1 , 131-156 (1997)
7. Everitt, B.: Cluster Analysis. Heinemann Educational Books, London (1980)
8. Fang, J., Grzymala-Busse, J.: Leukemia prediction from gene expression data—a rough set
approach. In: Proceedings of the Eighth International Conference on Artificial Intelligence and
Soft Computing, pp. 899-908 (2006)
9. Fang, J., Grzymala-Busse, J.: Mining of microRNA expression data—a rough set approach. In:
Proceedings of the First International Conference on Rough Sets and Knowledge Technology,
pp. 758-765 (2006)
10. Fang, J., Grzymala-Busse, J.: Predicting penetration across the blood-brain barrier—a rough
set approach. In: Proceedings of the IEEE International Conference on Granular Computing,
pp. 231-236 (2007)
11. Grzymala-Busse, J.W.: LERS—a system for learning from examples based on rough sets. In:
Slowinski, R. (ed.) Intelligent Decision Support: Handbook of Applications and Advances of
the Rough Set Theory, pp. 3-18. Kluwer Academic Publishers, Dordrecht, Boston (1992)
 