initial preprocessing via a discretization procedure to make it applicable to rough
set methodology. This pre-processing, however, leads to a loss of information and
introduces a subjective factor into the method.
The variable precision and Bayesian rough set models are focused on the recognition
and modelling of set overlap-based, also referred to as probabilistic, relationships
between sets, which are most useful when dealing with noisy data. In this approach,
the set-overlap relationships are used to construct approximations of undefinable sets
[11]. The primary application of the approach is to the analysis of data co-occurrence-based
dependencies in classification tables and probabilistic decision tables derived
from data, as discussed in the following sections. Both the probabilistic decision
tables and the classification tables are normally "learned" from data to represent
inter-data-item connections, typically for the purpose of their analysis or data value
prediction. The probabilistic decision tables can also be used as a basis of generalized
probabilistic rule induction algorithms [29], but this topic is outside the scope of this
chapter.
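To make the set-overlap idea concrete, the following is a minimal, hypothetical sketch of how variable precision rough set approximations of an undefinable target set could be computed from equivalence classes. The universe, the classes, and the threshold values `l` and `u` are invented for illustration; they are not taken from the chapter.

```python
# Hypothetical VPRS illustration: an equivalence class whose overlap with the
# target set X (measured as the conditional probability P(X | E)) is at least
# the upper threshold u joins the positive region; overlap at most the lower
# threshold l puts it in the negative region; anything in between is boundary.

def vprs_regions(classes, target, l=0.2, u=0.8):
    """Classify each equivalence class by its overlap with the target set."""
    pos, neg, bnd = [], [], []
    for cls in classes:
        p = len(cls & target) / len(cls)  # P(X | E): fraction of E inside X
        if p >= u:
            pos.append(cls)
        elif p <= l:
            neg.append(cls)
        else:
            bnd.append(cls)
    return pos, neg, bnd

# Toy universe of 8 objects partitioned into three equivalence classes.
classes = [{1, 2, 3}, {4, 5, 6}, {7, 8}]
target = {1, 2, 3, 4, 7}  # an "undefinable" target set X

pos, neg, bnd = vprs_regions(classes, target, l=0.2, u=0.8)
print(pos)  # [{1, 2, 3}]         P(X|E) = 1.0 >= 0.8
print(bnd)  # [{4, 5, 6}, {7, 8}] overlaps of 1/3 and 1/2 fall between l and u
```

With `l = 0` and `u = 1` this degenerates to Pawlak's original lower approximation and boundary, which is why the variable precision model is described as a generalization suited to noisy data.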
In practical applications of the data-acquired decision tables, one of the main
issues is the identification of a minimal subset of attributes, which are discrete
functions of measured features, to represent an identified data dependency without any
loss, or with minimal loss, of information. The original general idea of an attribute
reduct, as introduced by Pawlak [10, 11], is applicable here. However, the original
specific notion of reduct applies only to functional, or partial functional, data
dependencies. In this chapter, we discuss an extended notion of reduct, as defined
in the contexts of the variable precision and Bayesian rough set models. The notion of
reduct in these contexts allows for the information-preserving identification of minimal
subsets of attributes in the presence of probabilistic dependencies between attributes.
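The flavour of Pawlak-style attribute reduction can be sketched as follows: drop attributes one at a time, keeping a drop only if the dependency of the decision on the remaining attributes is preserved. This is a simplified, hypothetical sketch using plain functional dependency as the preserved measure, with invented attribute names and data; the chapter's extended reducts replace this measure with a probabilistic one.

```python
# Hypothetical reduct sketch: an attribute is superfluous if removing it
# leaves the decision still a function of the remaining condition attributes.

def is_functional(rows, attrs, decision):
    """True if the decision value is a function of the chosen attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False  # same condition values, different decisions
    return True

def greedy_reduct(rows, attrs, decision):
    reduct = list(attrs)
    for a in attrs:
        trial = [x for x in reduct if x != a]
        if is_functional(rows, trial, decision):
            reduct = trial  # attribute a carried no extra information
    return reduct

# Invented table: d = a XOR b, and c happens to duplicate a (c = 1 - a),
# so either of {a, b} or {b, c} fully determines d.
rows = [
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 1, "d": 1},
    {"a": 1, "b": 0, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
]
print(greedy_reduct(rows, ["a", "b", "c"], "d"))  # -> ['b', 'c']
```

Note that a reduct need not be unique: the greedy order decides which of the two equivalent minimal subsets is found here.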
The chapter is organized as follows. In the next section, we review the fundamentals
of the variable precision rough set approach, including the introduction of set
approximations and the basics of the related Bayesian rough set model. In Sect. 6.3,
we discuss different kinds of probabilistic dependencies occurring between a
"target set" and a partition of the universe of interest. The partition
is assumed to represent our classification knowledge. The target set is our learning
goal, whose approximate classification, in terms of the classification knowledge, we
are trying to learn. The dependencies in question reflect our overall ability to
create such a classification. In Sect. 6.4, the probabilistic attribute-value-based decision
tables are introduced, along with the related classification tables. Both kinds of
tables represent our classification knowledge with respect to the target set.
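As a rough sketch of what "learning" such a table from data could look like: rows sharing the same condition-attribute values form one equivalence class, and for each class one records its probability and the conditional probability of the target set. The column names, data sample, and tuple layout below are invented for illustration only.

```python
# Hypothetical construction of a probabilistic decision table: for every
# combination of condition-attribute values (an equivalence class E), record
# P(E) and P(X | E), where X is the target set of rows with a given decision.

from collections import defaultdict

def decision_table(rows, attrs, target_attr, target_value):
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in attrs)].append(row)
    n = len(rows)
    table = {}
    for key, members in groups.items():
        hits = sum(1 for r in members if r[target_attr] == target_value)
        table[key] = (len(members) / n,     # P(E): class probability
                      hits / len(members))  # P(X | E): target overlap
    return table

rows = [
    {"size": "big",   "colour": "red",  "cls": "yes"},
    {"size": "big",   "colour": "red",  "cls": "no"},
    {"size": "big",   "colour": "blue", "cls": "yes"},
    {"size": "small", "colour": "red",  "cls": "no"},
]
tbl = decision_table(rows, ["size", "colour"], "cls", "yes")
print(tbl[("big", "red")])  # (0.5, 0.5): half the data, target in half of it
```

Comparing each class's `P(X | E)` against precision thresholds then yields the rough-approximation labels that distinguish a probabilistic decision table from a plain classification table.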
The probabilistic decision tables additionally represent rough approximations of
the target set, as defined in the framework of the variable precision rough set theory.
The inter-attribute dependencies occurring in both the probabilistic decision tables
and the classification tables are the subject of Sect. 6.5. All the discussed dependencies
are probabilistic in nature and are defined in the context of either the variable precision
or the Bayesian rough set model. They generalize and expand the attribute dependencies
introduced by Pawlak in the original rough set theory [11]. Attribute reduction with
respect to the introduced dependencies is the subject of Sect. 6.6. The monotonicity
property of the introduced λ-dependency measure allows for a definition of the notion of
reduct with respect to this measure.