Databases Reference
In-Depth Information
R
is presented using CSA ordering, and
N
represents the size of
R
that can
be as large as “2
n
1”, a rule weighting scheme is proposed in this chapter,
which assigns a score to a CAR
R
j
∈R
−
n
−
that represents how significantly
R
j
contributes to a single class
c
i
∈
C
. An alternative rule ordering mechanism is
consequently introduced, based on the proposed rule weighting scheme, that
aims to improve the performance of the well-established CSA ordering regard-
ing the accuracy of classification. In [19] a general framework for selectors,
namely (
k,m,n
)-selectors, was proposed with applications in optimal group
testing. In this chapter, a similar concept of selectors is further considered in
a randomised setting. This randomised selector can be proved to exist with a
high probability. With regards to the concept of selectors, a novel rule mining
approach is presented that addresses the problem of mining the
k
“significant
rules” (see Definition 9 in Sect. 3.2) in
. The rule mining approach operates
in time
O
(
k
2
n
2
) in its deterministic fashion, and
O
(
kn
) in its randomised fash-
ion, where
k
represents the number of CARs in each class that can potentially
be used to distinguish between classes and
k
R
k
; as opposed to exponential
time
O
(2
n
) - the time required in score computation to find all
k
significant
(the “best
k
”) rules in
≥
in a “one-by-one” manner. The experimental results
show that the proposed rule weighting and rule ordering approaches perform
well regarding the accuracy of classification, and evidence the fast computa-
tional e
ciency of running the randomised rule mining approach. Note that
the deterministic rule mining approach is theoretical only.
R
1.2 Chapter Organisation
The following section describes some Data Mining work that relate to CARM.
In Sect. 3 we first introduce the rule weighting scheme together with a rule
ordering mechanism based on the rule weighting scheme; and sketch the con-
cept of deterministic selectors and give an introduction to the concept of
randomised selectors. In Sect. 4 we propose a rule mining approach to e
-
ciently mine the “best
k
”CARsin
. In Sect. 5 we present experimental
results obtained using the TFPC (Total From Partial Classification) CARM
algorithm [15, 18] coupled with a “best first” case satisfaction approach. Fi-
nally we discuss our conclusions in Sect. 6, and a number of open issues for
further research.
R
2 Related Work
2.1 Association Rule Mining
ARM extracts a set of ARs from
D
T
, first introduced in [1]. Let
I
=
{
a
1
,a
2
,...,a
n−
1
,a
n
}
be a set of items (data-attributes), and
T
=
{
T
1
,T
2
,...,
T
m−
1
,T
m
}
be a set of transactions (data-records),
D
T
is described by
T
,where