correct prediction rate is achieved on the testing data set (the lowest is 43.51%, the
highest is 58.25%). Another interesting point is that the attempt to improve prediction
accuracy by using equally distributed target-value samples does not change much;
there is only roughly a 3% improvement in the final result. With multiple supports,
the error rates are higher and the number of extracted rules is lower than with the
single-support mining engine.
The continuous time values yield better results than the manually discretized values.
This indicates that discretization may have caused some information loss.
Table 1. CBA Mining Results Summary. Rules are ranked by confidence.
                       Error rate (%)         Time cost (seconds)
Case          #Rules   Training   Testing     Training   Testing
Case1-SS-D    15       46.16      52.94       1.00       0.08
Case1-SS-C    10       45.180     47.56       1.01       0.07
Case1-MS-D    11       47.059     47.49       1.01       0.10
Case1-MS-C    9        45.180     47.56       1.04       0.09
Case2-SS-D    41       57.04      59.95*      0.41       1.1
Case2-SS-C    18       58.09      57.39*      0.44       1.3
Case2-MS-D    21       59.10      58.25       0.44       1.0
Case2-MS-C    12       58.45      58.91       0.45       1.2
Case3-SS-D    20       43.61      44.5        2.2        2.0
Case3-SS-C    15       43.5       43.8        2.2        2.0
Case3-MS-D    15       46.5       45.1        1.6        1.9
Case3-MS-C    15       46.5       46.9        1.6        1.6
10-CV-SS-D    22       50.5       52.5        25
10-CV-SS-C    18       46.05      46.89       25.4
10-CV-MS-D    17       48.87      49.1        28.9
10-CV-MS-C    16       45.98      45.02*      25.3
Case4-SS-D    15       46.16      N/A         0.60       N/A
Case4-SS-C    10       45.180     N/A         0.66       N/A
Case4-MS-D    11       47.059     N/A         0.77       N/A
Case4-MS-C    9        45.180     N/A         1.04       N/A

* This value appears out of position in the source; its placement in the Testing column is assumed.
The 10-CV rows give a single time cost value, shown here under Training.
No rule has a confidence value larger than 80%; however, the rules do describe
some characteristics of the PR fixing process. They are therefore useful to
project management for estimating the time needed to fix bugs.
The following are examples of classification rules generated with CBA:
Rule 1: If severity = non-critical and Time-to-fix = 3 to 30 days and priority = medium,
then class = doc-bug. Confidence = 82.7%, Support = 2.7%
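To make the two figures attached to a rule concrete, the following is a minimal sketch of how support and confidence are commonly computed for a class association rule: support is the fraction of all records that satisfy both the conditions and the class, and confidence is the fraction of records satisfying the conditions that also carry the class. The function name `rule_stats`, the field names, the `sw-bug` class label, and the ten toy records are assumptions made for illustration; they are not taken from the study's data or from the CBA tool itself.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def rule_stats(records: List[Record],
               antecedent: Dict[str, str],
               target_class: str) -> Tuple[float, float]:
    """Return (support, confidence) of the rule 'antecedent -> target_class'.

    Both values are fractions in [0, 1]; (0.0, 0.0) is returned when there
    are no records or no record matches the antecedent.
    """
    total = len(records)
    # Records that satisfy every condition on the left-hand side of the rule.
    matching = [r for r in records
                if all(r.get(attr) == value for attr, value in antecedent.items())]
    # Among those, the records that also have the predicted class.
    correct = [r for r in matching if r.get("class") == target_class]

    support = len(correct) / total if total else 0.0
    confidence = len(correct) / len(matching) if matching else 0.0
    return support, confidence


# Hypothetical toy data: 10 problem reports, 4 of which match the rule's
# conditions, and 3 of those 4 are labelled doc-bug.
matching_doc = {"severity": "non-critical", "time_to_fix": "3-30 days",
                "priority": "medium", "class": "doc-bug"}
matching_other = {"severity": "non-critical", "time_to_fix": "3-30 days",
                  "priority": "medium", "class": "sw-bug"}
non_matching = {"severity": "critical", "time_to_fix": "<3 days",
                "priority": "high", "class": "sw-bug"}
records = [matching_doc] * 3 + [matching_other] + [non_matching] * 6

antecedent = {"severity": "non-critical", "time_to_fix": "3-30 days",
              "priority": "medium"}
support, confidence = rule_stats(records, antecedent, "doc-bug")
print(f"support = {support:.1%}, confidence = {confidence:.1%}")
```

On this toy data the rule has 30.0% support and 75.0% confidence. A rule such as Rule 1, with high confidence but low support, is one that is usually right when it fires but applies to only a small share of the records.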