Table 8.3 An inconsistent data set

Case    Attributes                           Decision
        Temperature   Headache   Nausea      Flu
1       High          Yes        No          Yes
2       Very_high     No         No          Yes
3       Very_high     Yes        No          No
4       Normal        No         No          No
5       High          No         Yes         Maybe
6       Normal        Yes        Yes         Maybe
7       High          Yes        No          No

The LERS data mining system uses a rough set approach to inconsistent data, i.e., it computes lower and upper approximations for all concepts before applying the LEM1 or LEM2 algorithm. Let $X$ be a concept. In general, $X$ is not definable in $A$. However, $X$ may be approximated by two sets that are definable in $A$. The first one is called a lower approximation of $X$, denoted by $\underline{appr}(X)$ and defined as follows:

$$\bigcup \{[x] \mid x \in U,\ [x] \subseteq X\}.$$

The second set is called an upper approximation of $X$, denoted by $\overline{appr}(X)$ and defined as follows:

$$\bigcup \{[x] \mid x \in U,\ [x] \cap X \neq \emptyset\}.$$

For example, for the concept $[(Flu, yes)] = \{1, 2\}$, $\underline{appr}(\{1, 2\}) = \{2\}$ and $\overline{appr}(\{1, 2\}) = \{1, 2, 7\}$.
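
The two definitions can be checked directly against Table 8.3. The following is a minimal sketch, not the LERS implementation: the names `TABLE`, `elementary_sets`, `lower_approximation` and `upper_approximation` are hypothetical, and the values are spelled as they appear in the table (so the concept value is written "Yes").

```python
from collections import defaultdict

# Table 8.3: case -> (Temperature, Headache, Nausea, Flu)
TABLE = {
    1: ("High", "Yes", "No", "Yes"),
    2: ("Very_high", "No", "No", "Yes"),
    3: ("Very_high", "Yes", "No", "No"),
    4: ("Normal", "No", "No", "No"),
    5: ("High", "No", "Yes", "Maybe"),
    6: ("Normal", "Yes", "Yes", "Maybe"),
    7: ("High", "Yes", "No", "No"),
}
ATTRIBUTE_COLUMNS = (0, 1, 2)  # Temperature, Headache, Nausea
DECISION_COLUMN = 3            # Flu


def elementary_sets(table, columns):
    """Group cases that agree on every attribute in `columns` (the sets [x])."""
    blocks = defaultdict(set)
    for case, row in table.items():
        blocks[tuple(row[i] for i in columns)].add(case)
    return list(blocks.values())


def lower_approximation(blocks, concept):
    """Union of the elementary sets that are subsets of the concept."""
    return set().union(*(b for b in blocks if b <= concept))


def upper_approximation(blocks, concept):
    """Union of the elementary sets that intersect the concept."""
    return set().union(*(b for b in blocks if b & concept))


blocks = elementary_sets(TABLE, ATTRIBUTE_COLUMNS)
concept = {c for c, row in TABLE.items() if row[DECISION_COLUMN] == "Yes"}

lower = lower_approximation(blocks, concept)
upper = upper_approximation(blocks, concept)
print(sorted(concept))  # [1, 2]
print(sorted(lower))    # [2]
print(sorted(upper))    # [1, 2, 7]
```

Cases 1 and 7 share the same attribute values but differ on Flu, so the elementary set {1, 7} is excluded from the lower approximation and included in the upper one.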
Rules induced from lower approximations are called certain; rules induced from upper approximations are called possible.
Note that even though the data set from Table 8.3 is inconsistent, the attribute Nausea is still redundant (irrelevant), since

$$\{Temperature, Headache\}^* = \{Temperature, Headache, Nausea\}^* = \{\{1, 7\}, \{2\}, \{3\}, \{4\}, \{5\}, \{6\}\}.$$

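
The redundancy claim amounts to comparing two partitions of the set of cases. A small sketch of that comparison, assuming the attribute values of Table 8.3 (the helper name `partition` is hypothetical and only illustrates the check):

```python
# Attribute part of Table 8.3: case -> (Temperature, Headache, Nausea)
ATTRIBUTES = {
    1: ("High", "Yes", "No"),
    2: ("Very_high", "No", "No"),
    3: ("Very_high", "Yes", "No"),
    4: ("Normal", "No", "No"),
    5: ("High", "No", "Yes"),
    6: ("Normal", "Yes", "Yes"),
    7: ("High", "Yes", "No"),
}
TEMPERATURE, HEADACHE, NAUSEA = 0, 1, 2


def partition(table, columns):
    """Return the partition of cases induced by the chosen attribute columns."""
    blocks = {}
    for case, row in table.items():
        key = tuple(row[i] for i in columns)
        blocks.setdefault(key, set()).add(case)
    return {frozenset(b) for b in blocks.values()}


with_nausea = partition(ATTRIBUTES, (TEMPERATURE, HEADACHE, NAUSEA))
without_nausea = partition(ATTRIBUTES, (TEMPERATURE, HEADACHE))

# Nausea is redundant: dropping it leaves the partition unchanged.
print(with_nausea == without_nausea)  # True

# Dropping Headache instead merges cases 2 and 3, so the partition changes.
without_headache = partition(ATTRIBUTES, (TEMPERATURE, NAUSEA))
print(with_nausea == without_headache)  # False
```

The same comparison shows that neither Temperature nor Headache can be dropped for this table, since removing either one merges elementary sets.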
The LERS system computes, for every concept, a pair of data sets, based on lower and upper approximations, to induce certain and possible rule sets, respectively. For example, for the concept {1, 2}, certain rule sets are induced from the data set presented in Table 8.4 and possible rule sets from Table 8.5.
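
Tables 8.4 and 8.5 are not reproduced in this section, so the following is only a sketch under a stated assumption: that each derived data set keeps the concept's decision value for the cases inside the approximation and replaces the decision of every other case with SPECIAL. The function name `relabel` is hypothetical, and the approximations are taken from the example above rather than recomputed.

```python
# Decision column of Table 8.3: case -> Flu
FLU = {1: "Yes", 2: "Yes", 3: "No", 4: "No", 5: "Maybe", 6: "Maybe", 7: "No"}

# Approximations of the concept [(Flu, yes)] = {1, 2}, as given in the text.
LOWER = {2}
UPPER = {1, 2, 7}


def relabel(decisions, approximation, concept_value, special="SPECIAL"):
    """Assumed construction: cases inside the approximation receive the
    concept's value, all remaining cases receive the SPECIAL value."""
    return {
        case: concept_value if case in approximation else special
        for case in decisions
    }


# Assumed decision column of the data set used for certain rules.
certain_decisions = relabel(FLU, LOWER, "Yes")
# Assumed decision column of the data set used for possible rules.
# Under this assumption case 7 is labeled "Yes" although Table 8.3 says "No".
possible_decisions = relabel(FLU, UPPER, "Yes")

for case in sorted(FLU):
    print(case, certain_decisions[case], possible_decisions[case])
```

Under this assumption, rules induced for the SPECIAL value carry no information about the concept, which fits the remark that such rules are removed from the final rule set.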
Obviously, the final rule set, certain or possible, is a union of rule sets induced for
all concepts, from data sets based on lower or upper approximations, respectively,
with all rules for SPECIAL values removed.