simple modification results in an algorithm that generates an approximately
equal number of rules from each class irrespective of whether it is a majority
class or not.
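To make the idea concrete, generating an approximately equal number of rules from each class can be achieved by partitioning the training data by class label and mining each partition separately. The sketch below is a minimal, hypothetical illustration (single-item antecedents and the function name are assumptions for illustration, not the chapter's actual algorithm):

```python
from collections import Counter, defaultdict

def mine_rules_per_class(transactions, rules_per_class):
    """Partition the training data by class label and extract the
    `rules_per_class` most frequent single-item antecedents from each
    partition, so every class (majority or minority) contributes an
    approximately equal number of rules."""
    partitions = defaultdict(list)
    for items, label in transactions:
        partitions[label].append(items)
    rules = []
    for label, rows in partitions.items():
        counts = Counter(item for items in rows for item in set(items))
        for item, cnt in counts.most_common(rules_per_class):
            # Support is computed within the class partition only, which is
            # what prevents minority classes from being drowned out.
            rules.append((item, label, cnt / len(rows)))
    return rules
```

Because support is measured relative to each partition rather than the whole database, a minority class with few instances still yields the same number of rules as a majority class.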
The rest of this chapter is organized as follows. Our proposed classifier,
which we refer to as the ARM-KNN-BF classifier, is discussed in Sect. 2; a
primer on belief theory and the strategy we employ to accommodate highly
skewed databases are also included in Sect. 2. Section 3 presents the experi-
mental results. The conclusion, which includes several interesting research
directions, appears in Sect. 4.
2 The Proposed ARM-KNN-BF Classifier
Although ARM in its original form can be deployed for extracting rules from
large databases based on minimum support and minimum confidence condi-
tions [1], it does not effectively address all the requirements (C1-C3) stated
in Sect. 1. For example, one may develop a classifier based on rules that are
generated by simply ignoring all the training data instances possessing class
label ambiguities. But this strategy can potentially exclude a large portion
of the training data instances that would have otherwise provided extremely
crucial information. Moreover, since the training data set is highly skewed,
a classifier built on it tends to favor the majority classes at the expense of
the minority classes. Avoiding this scenario is of paramount importance,
since such a bias could have devastating consequences in a threat classification
environment.
As mentioned previously, we use belief theoretic notions to address (C1).
One could alleviate the computational and storage burdens (C2) as well as
the problems due to skewness of the database (C3) significantly by using a
coherent set of rules in the classifier that effectively captures the re-occurring
patterns in the database [31]. An effective ARM mechanism, as demonstrated
in [18], can produce such a set of rules.
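For intuition, the classic support/confidence filtering that underlies such an ARM mechanism can be sketched as follows. This is a hedged illustration restricted to single-item antecedents; the actual mechanism of [18] is more elaborate:

```python
from collections import defaultdict

def mine_class_rules(transactions, min_support, min_confidence):
    """Mine single-antecedent class association rules (item -> class)
    that satisfy minimum support and minimum confidence.
    `transactions` is a list of (feature_items, class_label) pairs."""
    n = len(transactions)
    item_count = defaultdict(int)   # support count of each antecedent item
    pair_count = defaultdict(int)   # support count of (item, class) pairs
    for items, label in transactions:
        for item in set(items):
            item_count[item] += 1
            pair_count[(item, label)] += 1
    rules = []
    for (item, label), cnt in pair_count.items():
        support = cnt / n                     # fraction of all instances
        confidence = cnt / item_count[item]   # P(class | item), empirically
        if support >= min_support and confidence >= min_confidence:
            rules.append((item, label, support, confidence))
    return rules
```

Note that a global minimum support of this kind is exactly what disadvantages minority classes, which motivates the partitioned variant described next.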
Each stage of the proposed algorithm can be summarized as follows: The
training phase consists of partitioned ARM, rule pruning and rule refinement.
The partitioned ARM mechanism generates an approximately equal number of
rules in each class. The rule pruning and refinement processes use the training
data set to select the important rules. Dempster-Shafer belief theoretic
notions [24] are then utilized in the classification stage, where we introduce
a classifier capable of taking certain types of ambiguities into account when
classifying an unknown instance.
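Since Dempster-Shafer notions are central to the classification stage, a minimal sketch of Dempster's rule of combination may be helpful. Masses are represented here as dicts keyed by frozensets of hypotheses; this encoding, and the two-hypothesis example below, are assumptions for illustration rather than the chapter's own formulation:

```python
def dempster_combine(m1, m2):
    """Combine two basic probability assignments (masses over frozensets of
    hypotheses drawn from the frame Theta) using Dempster's rule:
    m(A) is proportional to the sum of m1(B)*m2(C) over all B, C with
    B & C == A, normalized by 1 - K, where K is the total mass assigned
    to conflicting (empty) intersections."""
    combined = {}
    conflict = 0.0
    for B, mb in m1.items():
        for C, mc in m2.items():
            A = B & C
            if A:
                combined[A] = combined.get(A, 0.0) + mb * mc
            else:
                conflict += mb * mc   # mass lost to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {A: v / (1.0 - conflict) for A, v in combined.items()}
```

For example, combining one mass function that mildly supports a hypothesis with a second, more discriminating one redistributes the conflicting mass and yields a normalized assignment over the surviving focal elements.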
2.1 Belief Theory: An Introduction
Let Θ = {θ1, θ2, ..., θn} be a finite set of mutually exclusive and exhaustive
'hypotheses' about the problem domain. It signifies the corresponding 'scope