help of several domain experts who would classify the feature vectors (of the
instances) into different threat level classes. In such a situation, it is likely
that the domain experts would arrive at conflicting threat levels, which
essentially introduces ambiguities into the class labels of the instances in the
training data set. In addition, although the number of training data instances
that are classified as having a heightened threat level would likely be very
small, identification of targets possessing a heightened threat level would be
of critical importance. For example, suppose the threat classes for an airport
terminal security monitoring system are the following:
{NotDangerous, OfConcern, Dangerous, ExtremelyDangerous}. (1)
In the training data set, one is likely to encounter a large number of instances
labeled as NotDangerous and very few labeled as ExtremelyDangerous. The
classification results may then be biased toward the majority class.
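The bias toward the majority class can be seen in a minimal sketch. The label counts below are hypothetical and chosen only to illustrate a skewed training set over the classes in (1); the "classifier" is a deliberately degenerate one that always predicts the most frequent label.

```python
from collections import Counter

# Hypothetical label counts (assumption, not data from the chapter).
labels = (
    ["NotDangerous"] * 950
    + ["OfConcern"] * 35
    + ["Dangerous"] * 12
    + ["ExtremelyDangerous"] * 3
)

counts = Counter(labels)
majority_class, majority_count = counts.most_common(1)[0]

# Degenerate classifier: always predict the majority class.
predictions = [majority_class] * len(labels)

# Overall accuracy looks high because the majority class dominates.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def recall(target):
    """Fraction of instances whose true label is `target` that were predicted as `target`."""
    relevant = [p for p, y in zip(predictions, labels) if y == target]
    return sum(p == target for p in relevant) / len(relevant)


print("majority class:", majority_class)
print("accuracy:", accuracy)
print("recall on ExtremelyDangerous:", recall("ExtremelyDangerous"))
```

With these assumed counts, overall accuracy is high even though no ExtremelyDangerous instance is ever identified, which is why accuracy alone is a poor measure in this setting.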
In essence, a classifier for such a scenario needs to effectively address the
following characteristics:
(C1) The training data set may contain ambiguities in the class labels due
to the conflicting conclusions made by different domain experts.
(C2) The computational and storage requirements should be tolerable so
that classification can be carried out in real-time.
(C3) The threat class distribution in the training data set can be highly
skewed.
In this chapter, a classifier that can effectively take into consideration the
above characteristics typical of a threat detection and assessment scenario is
proposed [31]. To address (C1), several effective approaches are
available, for example, rough set theory [29,30] and belief theory. The relationship
between belief theory and other mechanisms is discussed in [8,15,17,26].
In our proposed classifier, belief theoretic notions are adopted. This is mainly
motivated by the fact that belief theory provides an easy and convenient way
of handling ambiguities. A classifier equipped with belief theoretic notions
can improve the overall classification accuracy while providing a quantitative
'confidence interval' on the classification results.
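As a rough illustration of how belief theory represents an ambiguous class label, the sketch below assigns mass to sets of classes rather than to single classes and computes the standard Dempster-Shafer belief and plausibility of a hypothesis. The mass values are hypothetical, and reading the 'confidence interval' as the pair [belief, plausibility] is an assumption made here for illustration; the classifier developed in this chapter may define it differently.

```python
# Frame of discernment: the threat classes from (1).
FRAME = frozenset({"NotDangerous", "OfConcern", "Dangerous", "ExtremelyDangerous"})

# Hypothetical mass assignment for one training instance (assumption).
# Experts who disagree between OfConcern and Dangerous contribute mass to the
# set {OfConcern, Dangerous}; mass on FRAME expresses complete ignorance.
mass = {
    frozenset({"Dangerous"}): 0.5,
    frozenset({"OfConcern", "Dangerous"}): 0.3,
    FRAME: 0.2,
}


def belief(hypothesis, mass):
    """Total mass of focal elements fully contained in the hypothesis."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)


def plausibility(hypothesis, mass):
    """Total mass of focal elements that intersect the hypothesis."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)


dangerous = frozenset({"Dangerous"})
print("Bel(Dangerous) =", belief(dangerous, mass))
print("Pl(Dangerous)  =", plausibility(dangerous, mass))
```

The gap between the two values reflects how much of the evidence is ambiguous: mass committed to a set of classes supports each member's plausibility without raising any single member's belief.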
To address (C2), the classifier is developed to operate on a rule set extracted
by an ARM algorithm that has been appropriately modified to handle
class label ambiguities. This rule set is significantly smaller than the
original database. This is the main difference between our proposed
classifier and the KNN-BF classifier in [4]. ARM has demonstrated its capability
of discovering interesting and useful co-occurring associations among
data in large databases [1,14,19,25]. The classifier in [18] also uses
a modified ARM method to extract the association rules. However, it does
not effectively address (C2).
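To make the idea of a rule set concrete, the following sketch computes the conventional support and confidence of a single class association rule over a toy data set. It uses the textbook ARM definitions with crisp class labels, not the modified algorithm of this chapter that handles ambiguous labels; the feature names and records are hypothetical.

```python
# Hypothetical records: (set of observed features, class label). Assumption only.
records = [
    ({"unattended_bag", "restricted_zone"}, "Dangerous"),
    ({"unattended_bag", "restricted_zone"}, "Dangerous"),
    ({"unattended_bag"}, "OfConcern"),
    ({"restricted_zone"}, "OfConcern"),
    ({"unattended_bag", "restricted_zone"}, "OfConcern"),
    (set(), "NotDangerous"),
]


def support(antecedent, label, records):
    """Fraction of all records that contain the antecedent and carry the label."""
    hits = sum(1 for feats, y in records if antecedent <= feats and y == label)
    return hits / len(records)


def confidence(antecedent, label, records):
    """Among records containing the antecedent, the fraction carrying the label."""
    covered = [y for feats, y in records if antecedent <= feats]
    if not covered:
        return 0.0
    return sum(1 for y in covered if y == label) / len(covered)


# Rule: {unattended_bag, restricted_zone} -> Dangerous
antecedent = {"unattended_bag", "restricted_zone"}
print("support:", support(antecedent, "Dangerous", records))
print("confidence:", confidence(antecedent, "Dangerous", records))
```

A classifier that stores only the rules passing support and confidence thresholds needs to consult far fewer items at classification time than one that searches the full training database.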
To address (C3), the proposed ARM algorithm is applied to different partitions
of the database where the partitioning is based on the class labels. This