Table 8.3 An inconsistent data set

        Attributes                           Decision
Case    Temperature    Headache    Nausea    Flu
1       High           Yes         No        Yes
2       Very_high      No          No        Yes
3       Very_high      Yes         No        No
4       Normal         No          No        No
5       High           No          Yes       Maybe
6       Normal         Yes         Yes       Maybe
7       High           Yes         No        No

The LERS data mining system uses a rough set approach to inconsistent data, i.e., it computes lower and upper approximations for all concepts before applying the LEM1 or LEM2 algorithm. Let X be a concept. In general, X is not definable in A. However, X may be approximated by two sets that are definable in A. The first is called the lower approximation of X, denoted by $\underline{appr}(X)$ and defined as follows:

$$\underline{appr}(X) = \bigcup \{[x] \mid x \in U,\ [x] \subseteq X\}.$$

The second set is called the upper approximation of X, denoted by $\overline{appr}(X)$ and defined as follows:

$$\overline{appr}(X) = \bigcup \{[x] \mid x \in U,\ [x] \cap X \neq \emptyset\}.$$

For example, for the concept [(Flu, yes)] = {1, 2},

$$\underline{appr}(\{1, 2\}) = \{2\} \quad \text{and} \quad \overline{appr}(\{1, 2\}) = \{1, 2, 7\}.$$

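The two approximations can be checked directly against Table 8.3. The following is a minimal sketch, not the LERS implementation: it assumes that [x] is the set of cases agreeing with x on all three attributes, and the names TABLE, elementary_sets, lower_approximation and upper_approximation are introduced here for illustration only.

```python
# Table 8.3: case -> (Temperature, Headache, Nausea, Flu)
TABLE = {
    1: ("High", "Yes", "No", "Yes"),
    2: ("Very_high", "No", "No", "Yes"),
    3: ("Very_high", "Yes", "No", "No"),
    4: ("Normal", "No", "No", "No"),
    5: ("High", "No", "Yes", "Maybe"),
    6: ("Normal", "Yes", "Yes", "Maybe"),
    7: ("High", "Yes", "No", "No"),
}


def elementary_sets(table):
    """Group cases that agree on all three attributes (the sets [x])."""
    blocks = {}
    for case, row in table.items():
        blocks.setdefault(row[:3], set()).add(case)
    return list(blocks.values())


def lower_approximation(table, concept):
    """Union of the elementary sets that are subsets of the concept."""
    return set().union(*(b for b in elementary_sets(table) if b <= concept))


def upper_approximation(table, concept):
    """Union of the elementary sets that intersect the concept."""
    return set().union(*(b for b in elementary_sets(table) if b & concept))


# The concept [(Flu, yes)]: all cases whose decision value is "Yes".
flu_yes = {case for case, row in TABLE.items() if row[3] == "Yes"}

lower = lower_approximation(TABLE, flu_yes)
upper = upper_approximation(TABLE, flu_yes)
print(lower, upper)
```

Case 1 drops out of the lower approximation because it shares its attribute values with case 7, whose decision differs; for the same reason case 7 enters the upper approximation.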
Rules induced from lower approximations are called certain; rules induced from upper approximations are called possible.
Note that even though the data set from Table 8.3 is inconsistent, the attribute Nausea is still redundant (irrelevant), since

$$\{Temperature, Headache\}^* = \{Temperature, Headache, Nausea\}^* = \{\{1, 7\}, \{2\}, \{3\}, \{4\}, \{5\}, \{6\}\}.$$

For every concept, the LERS system computes a pair of data sets, one based on the lower approximation and one on the upper approximation, from which certain and possible rule sets are induced, respectively. For example, for the concept {1, 2}, certain rule sets are induced from the data set presented in Table 8.4 and possible rule sets from the data set presented in Table 8.5.
The final rule set, certain or possible, is the union of the rule sets induced for all concepts from the data sets based on lower or upper approximations, respectively, with all rules for SPECIAL values removed.
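The construction of the two data sets can be sketched as follows. Tables 8.4 and 8.5 are not reproduced here, so the relabeling is an assumption: cases inside the approximation keep the decision value of the concept, and all remaining cases receive the placeholder value SPECIAL. The function names and the representation of a rule as a (conditions, decision value) pair are hypothetical, and no rule induction algorithm is implemented.

```python
SPECIAL = "SPECIAL"  # placeholder decision value named in the text


def approximation_data_set(cases, approximation, decision_value):
    """Map each case to its decision in the derived data set.

    Assumption: cases outside the approximation are relabeled SPECIAL.
    """
    return {
        case: decision_value if case in approximation else SPECIAL
        for case in cases
    }


def final_rule_set(rule_sets):
    """Union of the per-concept rule sets without rules for SPECIAL.

    Each rule is assumed to be a hashable (conditions, decision_value) pair.
    """
    return {
        rule
        for rules in rule_sets
        for rule in rules
        if rule[1] != SPECIAL
    }


cases = range(1, 8)

# Concept {1, 2} = [(Flu, yes)], using the approximations computed in the text.
lower_based = approximation_data_set(cases, {2}, "Yes")
upper_based = approximation_data_set(cases, {1, 2, 7}, "Yes")
print(lower_based)
print(upper_based)
```

Under this assumption, a rule induction algorithm run on either data set also produces rules for the SPECIAL value, which is why such rules are dropped before the per-concept rule sets are merged.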
 