10.1 Introduction
Real-world datasets often contain attributes that are irrelevant or redundant for the
classification problem at hand. Such features can degrade performance and
interfere with the learning mechanism, typically reducing the quality and generality
of the discovered patterns/model and causing the model to overfit the training data.
The basic principle of feature subset selection is to find the necessary and sufficient
subset of features or attributes that yields a simpler knowledge model and better
generalisation power, while at the same time not compromising classification accuracy.
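As a point of illustration only (not the approach developed in this chapter), the following minimal scikit-learn sketch performs filter-style feature subset selection, scoring features by mutual information with the class and checking that classification accuracy is not compromised; the dataset, the scoring function and the choice of k = 10 are arbitrary assumptions.

```python
# Illustrative sketch only: filter-style feature subset selection.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Hypothetical example dataset; any labelled tabular dataset would do.
X, y = load_breast_cancer(return_X_y=True)

# Score each feature by mutual information with the class and keep the top 10.
X_reduced = SelectKBest(score_func=mutual_info_classif, k=10).fit_transform(X, y)

# Compare cross-validated accuracy with the full and the reduced feature set.
clf = DecisionTreeClassifier(random_state=0)
print("all features:   ", cross_val_score(clf, X, y, cv=5).mean())
print("top 10 features:", cross_val_score(clf, X_reduced, y, cv=5).mean())
```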
Association rule mining, one of the most popular techniques for discovering
interesting associations among data objects, has also been utilized for the classi-
fication task, where it can help discover strong associations between occurring
attribute values and class values [26]. An associative classification framework
was first proposed in [28], consisting of an algorithm that generates all class asso-
ciation rules from which a classifier is constructed. Many works [10, 45, 49, 50] have
developed extensions and refinements of this initial framework, and the reported
results show high accuracy and efficiency for the classification problem. Similarly,
in the tree-structured data domain, the XRules structural classifier [52] is based on
association rules generated by the ordered embedded subtree mining algorithm [51].
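To make the notion of a class association rule concrete, the toy sketch below (illustrative only, and not the algorithm of [28] or [52]) enumerates rules of the form {attribute values} => class that satisfy minimum support and confidence thresholds, and then classifies a new record with the highest-confidence matching rule; the records, attribute values and thresholds are invented for the example.

```python
# Toy sketch of associative classification: mine class association rules,
# then classify with the best matching rule. Data and thresholds are invented.
from itertools import combinations

records = [
    ({"outlook=sunny", "windy=false"}, "play"),
    ({"outlook=sunny", "windy=true"},  "no-play"),
    ({"outlook=rain",  "windy=false"}, "play"),
    ({"outlook=rain",  "windy=true"},  "no-play"),
    ({"outlook=sunny", "windy=false"}, "play"),
]
min_sup, min_conf = 0.2, 0.8

rules = []  # (antecedent, class label, support, confidence)
items = sorted({i for attrs, _ in records for i in attrs})
for size in (1, 2):
    for ante in combinations(items, size):
        ante = frozenset(ante)
        covered = [c for attrs, c in records if ante <= attrs]
        for label in set(covered):
            sup = covered.count(label) / len(records)
            conf = covered.count(label) / len(covered)
            if sup >= min_sup and conf >= min_conf:
                rules.append((ante, label, sup, conf))

def classify(attrs):
    # Pick the matching rule with the highest confidence (ties: higher support).
    matching = [r for r in rules if r[0] <= attrs]
    return max(matching, key=lambda r: (r[3], r[2]))[1] if matching else None

print(classify({"outlook=rain", "windy=true"}))  # -> no-play
```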
When dealing with pattern selection, one faces both a quantity problem, due to the
large volume of output, and a quality assurance problem of ensuring that rules reflect
real, significant associations in the domain under investigation [25]. In a recent work
presented in [24], the search space of Apriori-like algorithms is pruned so that discovered
rules are interesting with respect to the Jaccard measure, rather than the support con-
straint, for which an optimal threshold is often unknown. To deal with the quality
problem, many interestingness measures have been developed and utilized in various
knowledge discovery tasks [12, 29]. In one line of thought, since data mining
techniques are data-driven by nature, the generated rules can often be effectively vali-
dated by a statistical methodology in order for them to be useful in practice [13, 22].
Interesting rules can then be interpreted as those that have a sound statistical
basis and are neither redundant nor contradictory. The aforementioned works [12, 13,
22, 29] have mainly focused on relational data; there is relatively little work in this
area for tree-structured data (an overview is given in the next section).
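For reference, the sketch below computes the support, confidence and Jaccard measure of a rule A => B from hypothetical transaction counts, illustrating how a rule with low support may still score highly on the Jaccard measure; the counts are invented and this is not the pruning procedure of [24].

```python
# Rule quality measures for A => B, where n_a / n_b count transactions
# containing A / B, n_ab counts those containing both, and n is the total.
# Jaccard(A => B) = P(A and B) / (P(A) + P(B) - P(A and B)).
def rule_measures(n_a, n_b, n_ab, n):
    support = n_ab / n
    confidence = n_ab / n_a
    jaccard = n_ab / (n_a + n_b - n_ab)
    return support, confidence, jaccard

# High-support rule with a very frequent consequent: moderate Jaccard.
print(rule_measures(n_a=600, n_b=900, n_ab=550, n=1000))  # ~ (0.55, 0.92, 0.58)
# Low-support rule whose antecedent and consequent almost always co-occur:
# higher Jaccard, even though a support threshold would likely discard it.
print(rule_measures(n_a=60, n_b=70, n_ab=55, n=1000))      # ~ (0.06, 0.92, 0.73)
```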
Tree-structured data has complex underlying structural characteristics which typi-
cally need to be preserved in the knowledge patterns discovered during a data mining
task [17, 52]. These structural characteristics pose difficulties for the application
of traditional classifiers and interestingness measures, whose mechanisms typically
do not take the structural aspects of the data into account.
In [38], a unified framework was proposed that systematically combines several
statistical/heuristic techniques to assess rule quality and remove redundant
and unnecessary rules for the classification problem. In this chapter, the focus is on
 