Information Technology Reference
In-Depth Information
8.2 Rule Induction Based on Feature Selection
We will discuss rule induction which belongs to
supervised learning
, i.e., we will
assume that all cases are preclassified by an expert. In the data set one of variables
is called a
decision
and the decision value is assigned by an expert to each case. A
very simple example of such a table is presented as Table
8.2
, in which attributes
are:
Temperature
,
Headache
,
Nausea
, and the decision is
Flu
. The set of all cases
labeled by the same decision value is called a
concept
. For Table
8.2
, case set {1, 2}
is a concept of all cases affected by flu (for each case from this set the corresponding
value of
Flu
is
yes
).
Note that in Table
8.2
the attribute
Nausea
is redundant (irrelevant). Remaining
two attributes (
Temperature
and
Headache
) distinguish all six cases. Let us make it
more precise using fundamental definitions of rough set theory [
26
-
28
]. Let
B
be
a nonempty subset of the set
A
of all attributes. Let
U
denote the set of all cases.
The indiscernibility relation
IND
(
B
)
is a relation on
U
defined for
x
,
y
∈
U
by
(
x
,
y
)
∈
IND
(
B
)
if and only if for both
x
and
y
the values for all attributes from
B
are identical.
The indiscernibility relation
IND
(
B
)
is an equivalence relation. Equivalence
classes of
IND
are called
elementary sets
of
B
. Any union of elementary sets
of
B
is called a
definable set
in
B
. For Table
8.2
, and
B
={
Temperature
,
Headache
},
elementary sets of
IND
(
B
)
are {1}, {2}, {3}, {4}, {5}, and {6}.
The family of all
B
-elementary sets is a partition on
U
. This set will be denoted
by
B
∗
, for example, in Table
8.2
,
(
B
)
}
∗
={{
{
Temperature
,
Headache
1
}
,
{
2
}
,
{
3
}
,
{
4
}
,
{
5
}
,
{
6
}}
.
depends on
B
if and only if
B
∗
≤{
}
∗
, i.e., for
For a decision
d
we say that
{
d
}
d
}
∗
such that
X
any elementary set
X
in
B
there exists a concept
C
from
{
d
ↆ
C
.A
global covering
(or
relative reduct
)of
depends
on
B
and
B
is minimal in
A
. The algorithm to compute a single global covering is
presented below.
{
d
}
is a subset
B
of
A
such that
{
d
}
Table 8.2
A consistent data
set
Case
Attributes
Decision
Temperature Headache
Nausea Flu
1
High
Ye s
No
Ye s
2
Very_high
No
No
Ye s
3
Very_high
Ye s
No
No
4
Normal
No
No
No
5
High
No
Ye s
Maybe
6
Normal
Ye s
Ye s
Maybe
Search WWH ::
Custom Search