Databases Reference
In-Depth Information
CARs for each class are mined, based on the existence of 50 potential
significant CARs for each class in R), the average accuracy was found to
be 76.79%.
The second set of evaluations used a confidence threshold value of 50%,
a set of decreasing support threshold values from 1 to 0.03%, and the
letter recognition dataset. The “large” letter recognition dataset
(letRecog.D106.N20000.C26) comprises 20,000 records and 26 pre-defined
classes. For the experiment the dataset was discretised and normalised
into 106 binary categories. The experiment shows that a relationship
exists between the selected value of the support threshold (σ or
min.support), the number of generated CARs (|R|), the accuracy of
classification (Accy), and the time in seconds spent on computation (Time).
Clearly, ↓σ ⇒ ↑|R| ⇒ (↑Accy ∧ ↑Time).
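The first link in this chain, that lowering σ admits more CARs, can be illustrated with a minimal sketch. The toy records, the `mine_cars` helper, and the brute-force enumeration below are all hypothetical and chosen only for illustration; they are not the rule mining approach evaluated here, just a plain support and confidence filter over candidate rules.

```python
from itertools import combinations

# Hypothetical toy data: each record is (set of binary attributes, class label).
records = [
    ({"a", "b"}, "X"),
    ({"a", "b"}, "X"),
    ({"a", "c"}, "X"),
    ({"a", "c"}, "Y"),
    ({"b", "c"}, "Y"),
    ({"b", "c"}, "Y"),
    ({"c"}, "Y"),
    ({"a"}, "X"),
]

def mine_cars(records, min_support, min_confidence, max_len=2):
    """Enumerate rules 'antecedent -> class' that meet both thresholds.

    Brute-force illustration only (assumption): every attribute subset up to
    max_len is tried, which would not scale to a real dataset.
    """
    n = len(records)
    items = sorted(set().union(*(attrs for attrs, _ in records)))
    classes = sorted({label for _, label in records})
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            ante = set(antecedent)
            # Class labels of all records that contain the antecedent.
            covered = [label for attrs, label in records if ante <= attrs]
            if not covered:
                continue
            for cls in classes:
                hits = covered.count(cls)
                support = hits / n                # fraction of all records
                confidence = hits / len(covered)  # fraction of covered records
                if support >= min_support and confidence >= min_confidence:
                    rules.append((antecedent, cls, support, confidence))
    return rules

# Fixed 50% confidence threshold, decreasing support threshold.
rule_counts = {s: len(mine_cars(records, s, 0.5)) for s in (0.5, 0.25, 0.125)}
for sigma, count in rule_counts.items():
    print(f"sigma={sigma:.3f}  |R|={count}")
```

On this toy data the rule count rises from 2 to 6 to 8 as the support threshold is halved twice, which mirrors the direction of the trend stated above, though not its magnitude.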
Table 2 demonstrates that with a 50% confidence threshold and a value
of 1 as the value for k (only the most significant CAR for each class is
mined in R), the proposed rule mining approach (in its randomised fashion)
performs well with respect to both accuracy of classification and efficiency
of computation. When applying the “one-by-one” rule mining approach, as σ
decreases from 1 to 0.03%, |R| (before mining the “best k” rules) is
increased from 149 to 6,341; and |R| (after mining the “best k” rules and
re-ordering all rules) is increased from 167 to 6,367. Consequently accuracy
has been increased from 29.41 to 48.22%, and Time (the time spent on mining
the k significant rules) has been increased from 0.08 to 12.339 s. In
comparison, when applying the proposed randomised rule mining approach with
a value of 50 as the value for k (there exist 50 potential significant rules
for each class in R), as σ decreases from 1 to 0.03%, |R| (before mining the
“best k” rules) is increased from 149 to 6,341; and |R| (after mining the
“best k” rules and
Table 2. Computational efficiency and classification accuracy (α = 50%)

Dataset: letRecog.D106.N20000.C26

         One-by-one approach (k = 1)              Randomised selector (k = 1, k = 50)
σ (%)    Rules     Rules     Time      Accuracy   Rules     Rules     Time      Accuracy
         (before)  (after)   (s)       (%)        (before)  (after)   (s)       (%)
1        149       167       0.080     29.41      149       166       0.160     29.60
0.75     194       212       0.110     29.94      194       211       0.160     29.92
0.50     391       415       0.200     35.67      391       411       0.251     35.78
0.25     1118      1143      1.052     40.36      1118      1139      0.641     41.26
0.10     2992      3018      4.186     44.95      2992      3016      0.722     45.18
0.09     3258      3284      4.617     45.21      3258      3282      1.913     45.42
0.08     3630      3656      6.330     45.88      3630      3655      2.183     45.43
0.07     3630      3656      6.360     45.88      3630      3656      2.163     46.02
0.06     4366      4392      5.669     46.70      4366      4391      2.754     46.45
0.05     4897      4923      7.461     47.28      4897      4922      3.235     47.65
0.04     5516      5542      9.745     47.67      5516      5542      3.526     47.53
0.03     6341      6367      12.339    48.22      6341      6365      4.296     48.79
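The timing and accuracy columns of Table 2 can be compared row by row. The sketch below copies those four columns from the table and derives, for each σ, the ratio of one-by-one time to randomised time and the accuracy difference in percentage points. The names `table2` and `compare` are hypothetical and used here only for illustration.

```python
# Rows copied from Table 2:
# (sigma %, one-by-one time (s), one-by-one accuracy (%),
#           randomised time (s), randomised accuracy (%))
table2 = [
    (1.00, 0.080, 29.41, 0.160, 29.60),
    (0.75, 0.110, 29.94, 0.160, 29.92),
    (0.50, 0.200, 35.67, 0.251, 35.78),
    (0.25, 1.052, 40.36, 0.641, 41.26),
    (0.10, 4.186, 44.95, 0.722, 45.18),
    (0.09, 4.617, 45.21, 1.913, 45.42),
    (0.08, 6.330, 45.88, 2.183, 45.43),
    (0.07, 6.360, 45.88, 2.163, 46.02),
    (0.06, 5.669, 46.70, 2.754, 46.45),
    (0.05, 7.461, 47.28, 3.235, 47.65),
    (0.04, 9.745, 47.67, 3.526, 47.53),
    (0.03, 12.339, 48.22, 4.296, 48.79),
]

def compare(rows):
    """Return (sigma, time ratio, accuracy difference) for each row.

    time ratio > 1 means the randomised selector was faster;
    accuracy difference > 0 means it was more accurate.
    """
    summary = []
    for sigma, t_one, acc_one, t_rand, acc_rand in rows:
        ratio = round(t_one / t_rand, 2)
        diff = round(acc_rand - acc_one, 2)
        summary.append((sigma, ratio, diff))
    return summary

summary = compare(table2)
for sigma, ratio, diff in summary:
    print(f"sigma={sigma:<5} time ratio={ratio:<5} accuracy diff={diff:+.2f}")

# Count the rows in which the randomised selector took less time.
faster_rows = sum(1 for _, ratio, _ in summary if ratio > 1)
print(f"randomised selector faster in {faster_rows} of {len(summary)} rows")
```

Read this way, the randomised selector is slower at the three largest support thresholds, where few rules are generated, and faster in the remaining nine rows, while the accuracy differences stay within about one percentage point in either direction.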
 