Information Technology Reference
In-Depth Information
8.5 LERS Classification System
There is a few existing classification systems, e.g., associated with rule induction
systems LERS or AQ [ 25 ]. A classification system used in LERS is a modification
of the well-known bucket brigade algorithm [ 3 , 19 , 30 ]. In the LERS classifica-
tion system the decision to which concept a case belongs is made on the basis of
three factors: strength , specificity , and support . These factors are defined as follows:
strength is the total number of cases correctly classified by the rule during training.
Specificity is the total number of attribute-value pairs on the left-hand side of the
rule. The matching rules with a larger number of attribute-value pairs are considered
more specific. The third factor, support , is defined as the sum of products of strength
and specificity for all matching rules indicating the same concept. The concept C for
which the support, i.e., the following expression
(
)
(
)
Strength
r
Specificity
r
matching rules r describing C
is the largest is the winner and the case is classified as being a member of C .
In the classification system of LERS, if complete matching is impossible, all
partially matching rules are identified. These are rules with at least one attribute-
value pair matching the corresponding attribute-value pair of a case. For any par-
tially matching rule r , the additional factor, called Matching _ factor ( r ), is computed.
Matching_factor( r ) is defined as the ratio of the number of matched attribute-value
pairs of r with a case to the total number of attribute-value pairs of r . In partial
matching, the concept C for which the following expression is the largest
Matching _ factor
(
r
)
Strength
(
r
)
Specificity
(
r
)
partially matching
rules r describing C
is the winner and the case is classified as being a member of C .
8.6 Experiments
In our experiments we used 14 data sets that are available on the Machine Learning
Repository at the University of California at Irvine, see Table 8.6 . Some of these data
sets were incomplete ( Breast Cancer-Slovenia , Soybean , Postoperative Patient and
Primary Tumor ).
For incomplete data sets missing attribute values were replaced by specified
attribute values using an imputation method called the most common value of an
attribute restricted to a concept [ 16 ].
 
Search WWH ::




Custom Search