Databases Reference
In-Depth Information
R
is presented using CSA ordering, and N represents the size of
R
that can
be as large as “2 n
1”, a rule weighting scheme is proposed in this chapter,
which assigns a score to a CAR R j ∈R
n
that represents how significantly R j
contributes to a single class c i
C . An alternative rule ordering mechanism is
consequently introduced, based on the proposed rule weighting scheme, that
aims to improve the performance of the well-established CSA ordering regard-
ing the accuracy of classification. In [19] a general framework for selectors,
namely ( k,m,n )-selectors, was proposed with applications in optimal group
testing. In this chapter, a similar concept of selectors is further considered in
a randomised setting. This randomised selector can be proved to exist with a
high probability. With regards to the concept of selectors, a novel rule mining
approach is presented that addresses the problem of mining the k “significant
rules” (see Definition 9 in Sect. 3.2) in
. The rule mining approach operates
in time O ( k 2 n 2 ) in its deterministic fashion, and O ( kn ) in its randomised fash-
ion, where k represents the number of CARs in each class that can potentially
be used to distinguish between classes and k
R
k ; as opposed to exponential
time O (2 n ) - the time required in score computation to find all k significant
(the “best k ”) rules in
in a “one-by-one” manner. The experimental results
show that the proposed rule weighting and rule ordering approaches perform
well regarding the accuracy of classification, and evidence the fast computa-
tional e ciency of running the randomised rule mining approach. Note that
the deterministic rule mining approach is theoretical only.
R
1.2 Chapter Organisation
The following section describes some Data Mining work that relate to CARM.
In Sect. 3 we first introduce the rule weighting scheme together with a rule
ordering mechanism based on the rule weighting scheme; and sketch the con-
cept of deterministic selectors and give an introduction to the concept of
randomised selectors. In Sect. 4 we propose a rule mining approach to e -
ciently mine the “best
k ”CARsin
. In Sect. 5 we present experimental
results obtained using the TFPC (Total From Partial Classification) CARM
algorithm [15, 18] coupled with a “best first” case satisfaction approach. Fi-
nally we discuss our conclusions in Sect. 6, and a number of open issues for
further research.
R
2 Related Work
2.1 Association Rule Mining
ARM extracts a set of ARs from D T , first introduced in [1]. Let I =
{
a 1 ,a 2 ,...,a n− 1 ,a n }
be a set of items (data-attributes), and
T
=
{
T 1 ,T 2 ,...,
T m− 1 ,T m }
be a set of transactions (data-records), D T is described by
T
,where
Search WWH ::




Custom Search