information-preserving reduct with respect to this dependency. A couple of efficient,
linear-time algorithms for computing single-attribute reducts, in either classification
tables or probabilistic decision tables, are presented. The ability to compute reducts
also allows us to determine the importance, or significance, of attributes. This is the
subject of Sect. 6.7. Finally, in Sect. 6.8, we discuss the concept of generalized core
attributes, the extension of the original core attributes introduced by Pawlak [10, 11].
The core attributes are the fundamental ones, which are preserved in every attribute
reduction.
6.2 Variable Precision Rough Sets
In the rough set approach to data analysis, the crucial aspect is the existence of
an ability, or knowledge, to form a prior classification of the universe of objects
of interest into distinct classes. This ability, or classification knowledge, is usually
associated with an external agent, such as a medical professional, who is assumed
to know how to classify objects (for example, patients) into categories (for example,
into health condition groups). However, in automated systems such an expert is
typically not available. Instead, the system has to rely on measurements taken by
its sensors (for example, temperature, blood pressure, etc.) to perform the
classification. In the rough set approach, the measurements are converted into
discrete features called attribute values, which are then used to classify objects. We
elaborate on attribute value-based classifications in detail in Sect. 6.4.
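As an illustration of this conversion step, the following sketch maps raw sensor readings onto discrete attribute values. The attribute names and cut-off thresholds are hypothetical, chosen only for the example; they are not part of the model itself.

# A minimal sketch of converting raw measurements into discrete attribute
# values. The attribute names and thresholds are hypothetical.

def discretize_temperature(temp_celsius: float) -> str:
    """Map a temperature reading onto a discrete attribute value."""
    if temp_celsius < 36.0:
        return "low"
    if temp_celsius <= 37.5:
        return "normal"
    return "high"

def discretize_blood_pressure(systolic_mm_hg: float) -> str:
    """Map a systolic blood pressure reading onto a discrete attribute value."""
    if systolic_mm_hg < 120:
        return "normal"
    if systolic_mm_hg < 140:
        return "elevated"
    return "high"

def to_attribute_values(measurements: dict) -> dict:
    """Convert one record of sensor measurements into attribute values."""
    return {
        "Temperature": discretize_temperature(measurements["temperature"]),
        "BloodPressure": discretize_blood_pressure(measurements["systolic"]),
    }

# Example: a single patient record.
print(to_attribute_values({"temperature": 38.2, "systolic": 150}))
# -> {'Temperature': 'high', 'BloodPressure': 'high'}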
The general variable precision rough set (VPRS) model does not make any
assumptions about how the prior classification was performed. It merely assumes that
some kind of prior knowledge exists and is represented in mathematical form by an
equivalence relation, referred to as an indiscernibility relation IND on the universe U,
IND ⊆ U × U. The relation is assumed to have a finite number of equivalence classes,
i.e. classification categories, called elementary sets. It should be noted that the
assumption of a finite number of classes may not be satisfied in general, but in
attribute-value systems, which are the focus of this chapter, it is always the case.
The collection of elementary sets of the IND relation will be denoted as IND*.
The pair (U, IND) is called an approximation space.
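To make the notion of elementary sets concrete, the sketch below groups objects into equivalence classes of an indiscernibility relation induced by attribute values: two objects are indiscernible when all of their attribute values coincide. The sample universe and attributes are assumptions made for illustration only.

# A minimal sketch of computing the elementary sets of an indiscernibility
# relation IND induced by attribute values. The sample data are hypothetical.

from collections import defaultdict

def elementary_sets(objects: dict) -> list:
    """Group object identifiers into equivalence classes (elementary sets)."""
    classes = defaultdict(set)
    for obj_id, attribute_values in objects.items():
        # Objects with identical attribute-value vectors fall into one class.
        key = tuple(sorted(attribute_values.items()))
        classes[key].add(obj_id)
    return list(classes.values())

# Hypothetical universe U of patients described by two attributes.
U = {
    "p1": {"Temperature": "high", "BloodPressure": "high"},
    "p2": {"Temperature": "high", "BloodPressure": "high"},
    "p3": {"Temperature": "normal", "BloodPressure": "normal"},
    "p4": {"Temperature": "normal", "BloodPressure": "elevated"},
}

# The collection IND* of elementary sets; (U, IND) is the approximation space.
print(elementary_sets(U))
# -> [{'p1', 'p2'}, {'p3'}, {'p4'}]  (element order within each set may vary)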
Let X be an arbitrary subset, referred to as the target set, of the universe U, X ⊆ U.
In practice, the universe is a finite non-empty collection of objects of interest, such
as medical patients, and the target set is our “goal” class, for example, representing
the class of patients suffering from a specific disease. Our objective is to create a
system that would allow us to classify arbitrary objects into the “goal” class, or its
complement, with an error rate that we would consider acceptable in the context
of our criteria (which are domain-specific and, consequently, outside of the rough set
model), but lower, on average, than in the case of random classification. For exam-
ple, the objective may be to predict (diagnose) the presence, or absence, of a specific
disease based on the results of medical tests, which are supposed to increase the accu-
racy of such predictions (if the tests are properly designed) in comparison to predictions
based solely on the frequency of occurrence of the disease in the population.
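A rough sketch of this objective is given below. It assumes a simple majority rule within each elementary set (not the VPRS criterion developed later in the chapter) and hypothetical data; it only illustrates how elementary-set-based classification can achieve a lower expected error than predictions based solely on the overall class frequency.

# A minimal sketch comparing classification guided by elementary sets with a
# baseline that relies only on the overall frequency of the target class.
# The data and the simple majority rule are assumptions made for illustration.

def expected_errors(elementary_sets, target_set, universe_size):
    """Return (baseline_error, elementary_set_error) as fractions of U."""
    # Baseline: always predict the more frequent of X and its complement.
    p_x = len(target_set) / universe_size
    baseline_error = min(p_x, 1.0 - p_x)

    # Elementary-set rule: within each class, predict the locally more
    # frequent outcome; the misclassified objects form the local minority.
    misclassified = 0
    for E in elementary_sets:
        in_x = len(E & target_set)
        misclassified += min(in_x, len(E) - in_x)
    return baseline_error, misclassified / universe_size

# Hypothetical data: 6 patients, target set X of diseased patients.
classes = [{"p1", "p2", "p3"}, {"p4", "p5"}, {"p6"}]
X = {"p1", "p2", "p6"}

print(expected_errors(classes, X, universe_size=6))
# -> (0.5, 0.16666666666666666): a lower error than the frequency baseline.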
 