methods on the Down's syndrome detection problem. Furthermore, it has been shown
to be competitive with state-of-the-art classifiers for imbalanced datasets on
datasets from the UCI repository.
This method extracts information from the dataset and expresses it as a fuzzy system
with a small number of rules, as can be seen in Tables 2.3 and 2.4. Because the rules
are few, they are not specialized in cases of the minor class but are distributed
among both classes. Normally, more, or many more, rules belong to the major class
than to the minor class. This was one of the goals to be achieved by the method.
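To make the idea of a small fuzzy rule base distributed among both classes concrete, the following is a minimal sketch and not the FLAGID algorithm itself. It assumes trapezoidal membership functions, a minimum operator to combine the conditions of a rule, and classification by the rule with the highest firing strength. The feature names (x1, x2), the rule parameters, and the helper names (trapezoid, Rule, classify, rules_per_class) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], 0 outside (a, d), linear in between."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


@dataclass
class Rule:
    label: str  # class predicted by the rule: "major" or "minor"
    sets: dict  # feature name -> (a, b, c, d) trapezoid parameters

    def firing(self, sample):
        # Assumption: conditions are combined with the minimum (fuzzy AND).
        return min(trapezoid(sample[f], *p) for f, p in self.sets.items())


# Hypothetical rule base: two major-class rules and one minor-class rule.
RULES = [
    Rule("major", {"x1": (0.0, 0.0, 0.4, 0.6)}),
    Rule("major", {"x1": (0.3, 0.5, 1.0, 1.0), "x2": (0.0, 0.0, 0.5, 0.7)}),
    Rule("minor", {"x1": (0.5, 0.7, 1.0, 1.0), "x2": (0.6, 0.8, 1.0, 1.0)}),
]


def classify(sample, rules=RULES):
    """Return the label of the rule with the highest firing strength."""
    return max(rules, key=lambda r: r.firing(sample)).label


def rules_per_class(rules=RULES):
    """Count how many rules belong to each class."""
    return Counter(r.label for r in rules)


if __name__ == "__main__":
    print(classify({"x1": 0.2, "x2": 0.9}))
    print(classify({"x1": 0.9, "x2": 0.9}))
    print(rules_per_class())
```

In this toy rule base the major class holds more rules than the minor class, which mirrors the distribution described above; how the actual rules are obtained is the subject of the chapter and is not reproduced here.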
Finally, we can conclude that the new method presented in this chapter, called
FLAGID (Fuzzy Logic And Genetic algorithms for Imbalanced Datasets), is an
effective method for dealing with imbalanced datasets.