To apply the attribute reduction methods of this chapter to real-world data sets, we note the following three points. First, we need additional measures to select the best reducts for applications, for example, minimizing the size of the reduct or the number of equivalence classes induced by the reduct [1, 13, 27, 42, 48, 50]. Such an optimization problem cannot, in general, be solved in polynomial time. Therefore, there are heuristic methods that compute one or several reducts that are near-optimal [1, 27, 42, 48]. This does not mean that the Boolean functions studied in this chapter are useless for applications: they can be incorporated into heuristic methods.
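As an illustration, here is a minimal sketch of such a heuristic: greedy backward elimination for a consistent decision table, where preserving means that every equivalence class induced by the remaining attributes is pure with respect to the decision. The table encoding, the function names, and the single-pass strategy are our illustrative assumptions, not the specific algorithms of the cited references.

```python
def partition(table, attrs):
    """Equivalence classes of the indiscernibility relation
    induced by the attribute subset `attrs`."""
    classes = {}
    for i, row in enumerate(table):
        classes.setdefault(tuple(row[a] for a in attrs), []).append(i)
    return list(classes.values())

def preserves(table, decisions, attrs):
    """Preserving condition for a consistent table: every
    equivalence class is pure with respect to the decision."""
    return all(len({decisions[i] for i in cls}) == 1
               for cls in partition(table, attrs))

def greedy_reduct(table, decisions, all_attrs):
    """One pass of backward elimination: drop each attribute whose
    removal still preserves the decision.  Because preservation is
    monotone in the attribute set, a single pass yields an
    inclusion-minimal reduct, though not necessarily one of
    minimum size."""
    reduct = list(all_attrs)
    for a in all_attrs:
        candidate = [b for b in reduct if b != a]
        if preserves(table, decisions, candidate):
            reduct = candidate
    return reduct
```

For example, greedy_reduct([(0, 1), (0, 0), (1, 1)], ['a', 'a', 'b'], [0, 1]) returns [0], since the first attribute alone already determines the decision.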
Second, when data sets include numerical or continuous attribute values, the approach of this chapter does not work well, because neither the order of values nor the degree of difference between values is taken into account (except for criteria in DRSM). There are two approaches to overcoming this drawback. One is discretization [7, 15], in which the domain of a numerical attribute is partitioned into a small number of intervals. After discretization, we can apply attribute reduction to the data set without modification.
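As a minimal sketch of this first approach, the following equal-width discretization maps each numerical value to an interval index; the bin count, the function name, and the equal-width scheme are illustrative assumptions rather than the specific methods of [7, 15].

```python
def equal_width_discretize(values, n_bins=3):
    """Partition the range of a numerical attribute into n_bins
    equal-width intervals and map each value to its interval index,
    so that symbolic attribute reduction can then be applied."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant attribute
    return [min(int((v - lo) / width), n_bins - 1) for v in values]
```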
The other is to use a similarity relation [12, 28] instead of the indiscernibility relation, or a fuzzy partition [12, 22, 27] instead of the equivalence classes, and to define extensions of RSM. In that case, we can define structure-based reducts for the extended RSMs in the same way as in this chapter.
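As a minimal sketch of the similarity-based extension, the following computes a similarity (tolerance) class for each object of a numerical data set; the threshold eps and the attribute-wise absolute-difference criterion are illustrative assumptions, not the particular relations of [12, 28]. Unlike equivalence classes, these classes may overlap.

```python
def similarity_classes(table, attrs, eps=0.1):
    """Replace the indiscernibility relation by a similarity
    (tolerance) relation: objects i and j are similar when their
    values differ by at most eps on every attribute in attrs.
    Returns, for each object, the set of objects similar to it."""
    def similar(i, j):
        return all(abs(table[i][a] - table[j][a]) <= eps for a in attrs)
    n = len(table)
    return [{j for j in range(n) if similar(i, j)} for i in range(n)]
```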
Third, reducts can suffer from overfitting because of the rigid definitions of their preserving conditions. One technique for avoiding overfitting is dynamic reducts [1]: decision tables restricted to randomly and repeatedly selected object subsets of a given cardinality are generated, and the reducts that appear in more of these decision tables than a given threshold are chosen as dynamic reducts.
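The following is a minimal sketch of this procedure, reusing the hypothetical greedy_reduct from the earlier sketch as the per-subtable reduct finder. Note that it computes only one reduct per sampled subtable, whereas [1] considers all reducts of each subtable; the trial count and threshold defaults are also illustrative assumptions.

```python
import random

def dynamic_reducts(table, decisions, all_attrs,
                    subtable_size, n_trials=100, threshold=0.5):
    """Repeatedly sample object subsets of a fixed cardinality,
    compute a reduct of each subtable, and keep the reducts whose
    relative frequency reaches the given threshold."""
    counts = {}
    for _ in range(n_trials):
        idx = random.sample(range(len(table)), subtable_size)
        sub = [table[i] for i in idx]
        dec = [decisions[i] for i in idx]
        r = tuple(sorted(greedy_reduct(sub, dec, all_attrs)))
        counts[r] = counts.get(r, 0) + 1
    return [set(r) for r, c in counts.items() if c / n_trials >= threshold]
```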
In this chapter, we did not discuss algorithms for computing reducts or numerical experiments; these can be found in [5, 6, 8, 11, 13, 17, 19, 34, 41, 50, 51]. Those references show how to select a desirable reduct or find an optimal one, and how to use the selected reduct for building classifiers. They also report experimental results on benchmark and real-world data sets. Although they do not cover some of the types of reducts in this chapter, especially most types of reducts in VPRSM, their results suggest that the proposed reducts would be useful in applications.
The proofs of the theoretical results in this chapter are not difficult; parts of them can be found in our papers [20, 23, 25, 31].
References
1. Bazan, J.G., Nguyen, H.S., Nguyen, S.H., Synak, P., Wróblewski, J.: Rough set algorithms in classification problem. In: Polkowski, L., Tsumoto, S., Lin, T.Y. (eds.) Rough Set Methods and Applications, pp. 49–88. Physica-Verlag, New York (2000)
2. Ben-David, A.: Monotonicity maintenance in information-theoretic machine learning algorithms. Mach. Learn. 19, 29–43 (1995)
3. Ben-David, A., Sterling, L., Pao, Y.H.: Learning and classification of monotonic ordinal concepts. Comput. Intell. 5(1), 45–49 (1989)
 