        G := B − ∪_{T ∈ 𝒯} [T];
    end {while};
    for each T ∈ 𝒯 do
        if ∪_{S ∈ 𝒯 − {T}} [S] = B then 𝒯 := 𝒯 − {T};
end {procedure}.
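The final loop of the procedure, which drops every member of the covering whose removal still leaves B fully covered, might be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `prune_local_covering`, the labels `T1` to `T3`, and the representation of each block [T] as a precomputed Python set are hypothetical and do not come from the source.

```python
def prune_local_covering(blocks, B):
    """Drop redundant members of a covering.

    blocks: dict mapping a (hypothetical) label of each member T of the
            covering to its block [T], given as a set of case identifiers.
    B:      the set of cases that the covering must describe.
    """
    kept = dict(blocks)
    for name in list(blocks):
        # Union of the blocks of all members still kept, except this one.
        others = set().union(*(blk for other, blk in kept.items() if other != name))
        if others == B:
            # The remaining members already cover B, so this one is redundant.
            del kept[name]
    return kept


B = {1, 2, 3, 4}
blocks = {"T1": {1, 2}, "T2": {2, 3}, "T3": {3, 4}}
pruned = prune_local_covering(blocks, B)
```

In this toy input, `T2` is removed because `T1` and `T3` together already cover B, while neither `T1` nor `T3` can be dropped.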
For a set X, |X| denotes the cardinality of X.
MLEM2, a modified version of LEM2, processes numerical attributes
differently than symbolic attributes. For each numerical attribute, MLEM2
first sorts all values of the attribute. It then computes cutpoints as
averages of any two consecutive values in the sorted list. For each cutpoint
q, MLEM2 creates two blocks: the first contains all cases for which the value
of the numerical attribute is smaller than q, and the second contains the
remaining cases, i.e., all cases for which the value of the numerical
attribute is larger than q. The search space of MLEM2 is the set of all blocks
computed this way, together with the blocks defined by symbolic attributes.
From that point on, rule induction in MLEM2 proceeds the same way as in LEM2.
Additionally, the newest version of MLEM2, with merging intervals, simplifies
rules at the very end by, as its name indicates, merging intervals for
numerical attributes.
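The cutpoint construction that MLEM2 applies to a numerical attribute can be illustrated with a short Python sketch. It is not the LERS implementation: the function name `numeric_blocks`, the dictionary representation of cases, and the choice to remove duplicate values before averaging are assumptions made for the example.

```python
def numeric_blocks(values):
    """Build MLEM2-style blocks for one numerical attribute.

    values: dict mapping a case identifier to the attribute value.
    Returns a list of (cutpoint, lower_block, upper_block) triples.
    """
    # Assumption: duplicates are dropped so that each cutpoint lies
    # strictly between two different attribute values.
    distinct = sorted(set(values.values()))
    result = []
    for a, b in zip(distinct, distinct[1:]):
        q = (a + b) / 2  # cutpoint: average of two consecutive values
        lower = {case for case, v in values.items() if v < q}
        upper = {case for case, v in values.items() if v > q}
        result.append((q, lower, upper))
    return result


# Four cases with attribute values 4, 6, 6, and 9.
blocks = numeric_blocks({1: 4.0, 2: 6.0, 3: 6.0, 4: 9.0})
```

For this input the cutpoints are 5.0 and 7.5. The first yields the blocks {1} and {2, 3, 4}; the second yields {1, 2, 3} and {4}.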
4 Classification System
Rules induced from raw training data are used to classify unseen testing
data. The classification system of LERS is a modification of the bucket
brigade algorithm [1, 12]. The decision as to which concept a case belongs is
made on the basis of three factors: strength, specificity, and support. They are
defined as follows: Strength is the total number of cases correctly classified by
the rule during training. Specificity is the total number of attribute-value pairs
on the left-hand side of the rule. The matching rules with a larger number of
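attribute-value pairs are considered more specific. The third factor, support,
is defined as the sum, over all matching rules R describing the concept C, of
the products of strength and specificity:
\[
\sum_{\text{matching rules } R \text{ describing } C}
\mathit{Strength\_factor}(R) \ast \mathit{Specificity\_factor}(R).
\]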
The concept C for which the support is the largest is the winner, and the
case is classified as being a member of C.
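The voting scheme for completely matching rules can be illustrated with a small Python sketch. The rule representation (a dictionary with `conditions`, `concept`, and `strength` keys), the function name `classify`, and the toy rules are hypothetical; the sketch only assumes that support is the sum of strength times specificity over the matching rules of each concept, and it does not cover partial matching.

```python
from collections import defaultdict


def classify(case, rules):
    """Return (winning concept, support per concept) for one case.

    case:  dict mapping attribute names to values.
    rules: list of dicts with keys 'conditions' (attribute-value pairs),
           'concept', and 'strength' (cases correctly classified in training).
    """
    support = defaultdict(int)
    for rule in rules:
        conditions = rule["conditions"]
        # Complete matching: every attribute-value pair of the rule must hold.
        if all(case.get(attr) == val for attr, val in conditions.items()):
            specificity = len(conditions)  # number of attribute-value pairs
            support[rule["concept"]] += rule["strength"] * specificity
    if not support:
        return None, {}  # no completely matching rule
    winner = max(support, key=support.get)
    return winner, dict(support)


rules = [
    {"conditions": {"temperature": "high"}, "concept": "flu", "strength": 3},
    {"conditions": {"temperature": "high", "cough": "yes"}, "concept": "flu", "strength": 2},
    {"conditions": {"headache": "yes"}, "concept": "cold", "strength": 5},
]
case = {"temperature": "high", "cough": "yes", "headache": "yes"}
winner, support = classify(case, rules)
```

Here the concept `flu` collects 3 * 1 + 2 * 2 = 7 and `cold` collects 5 * 1 = 5, so the case is assigned to `flu`. How ties are broken is not stated in the text; the sketch simply keeps the first concept that reaches the maximum.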
In the classification system of LERS, if complete matching is impossible,
all partially matching rules are identified. These are rules with at least one
attribute-value pair matching the corresponding attribute-value pair of a case.
For any partially matching rule R , the additional factor, called Matching
factor ( R ), is computed. Matching factor ( R ) is defined as the ratio of the
number of matched attribute-value pairs of R with a case to the total number
 