Precision has been used to measure the correctness of the results, while the
completeness has been assessed by employing the recall measure. More precisely,
Precision (P) and Recall (R) are given by:
\[
P = \frac{\#\,\text{actual identified clones}}{\#\,\text{total candidate clones}},
\qquad
R = \frac{\#\,\text{actual identified clones}}{\#\,\text{total actual clones}}.
\]
To assess whether the approach is effective (RQ3), we computed a version
of the F-measure where Precision and Recall have the same weight, namely
\[
F_1 = 2 \cdot \frac{P \cdot R}{P + R}.
\]
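The three measures can be sketched in a few lines of Python. This is only an illustrative sketch, not the tooling used in the study: the function name `precision_recall_f1`, the representation of clones as hashable identifiers, and the convention of returning 0.0 when a denominator is zero are all assumptions made for the example.

```python
def precision_recall_f1(candidates, actual):
    """Return (P, R, F1) for candidate clones checked against actual clones.

    Both arguments are iterables of hashable clone identifiers
    (a hypothetical representation chosen for this sketch).
    """
    candidates = set(candidates)
    actual = set(actual)

    # "Actual identified clones": candidates that really are clones.
    identified = len(candidates & actual)

    # Assumed convention: a measure with an empty denominator is 0.0.
    precision = identified / len(candidates) if candidates else 0.0
    recall = identified / len(actual) if actual else 0.0

    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: 4 candidates, 3 actual clones, 2 in common,
# so P = 2/4 and R = 2/3.
p, r, f1 = precision_recall_f1({"a", "b", "c", "d"}, {"a", "b", "e"})
```

Because Precision and Recall carry the same weight, F1 is their harmonic mean and is pulled toward the lower of the two values.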
6 Results and Threats to Validity
In this section we discuss the results obtained by applying the approach to
the different clone types, using different similarity thresholds for the
detection. We first address the three research questions, then discuss how
we handled the main threats to validity.
Table 3. Summary statistics of the results

Clone Type   Threshold   Precision   Recall   F1
Type 1       N.A.        1.0         1.0      1.0
Type 2       0.7         0.6         0.9      0.7
Type 2       0.8         0.7         0.6      0.6
Type 3       0.7         0.6         0.8      0.7
Type 3       0.8         0.6         0.8      0.7
6.1 Correctness, Completeness and Effectiveness of the Results
Since the Tree Kernel-based approach does not include any formatting detail in
its internal source code representation, Type 1 clones include no variability, and
thus no similarity threshold is necessary. For this kind of clone, an F-measure
of 1.0 is easily obtained.
For the other two types of clones, modifications were applied to the identifiers
(Type 2 and 3) and to the statements (Type 3 only). In these cases, larger
values of the threshold (e.g. 0.9) produce a small number of candidates.
As a consequence, the recall is low, since only code fragments that are
very similar are considered clones. This effect is particularly evident for Type
3 clones, where no clones at all are detected. On the other hand, threshold values
like 0.7 and 0.8 lead to better performance. In particular, the value 0.7 seems to
improve completeness with little or no loss of correctness (Table 3), and is
therefore preferable. These results are closely comparable with those reported
in [9] on all three indicators we consider, namely correctness, completeness
and effectiveness, thus supporting the validity of the artificially generated data.
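The effect of the threshold on the number of candidates can be illustrated with a small sketch. The similarity scores below are invented toy values, not outputs of the Tree Kernel approach and not data from the study. The names `detect`, `sweep`, `toy_scores` and `toy_actual` are hypothetical, and the sketch assumes that a pair becomes a candidate when its similarity is greater than or equal to the threshold.

```python
def detect(scored_pairs, threshold):
    """Return the pairs whose similarity reaches the threshold (assumed rule)."""
    return {pair for pair, sim in scored_pairs.items() if sim >= threshold}


def sweep(scored_pairs, actual, thresholds):
    """Map each threshold to (number of candidates, precision, recall)."""
    results = {}
    for t in thresholds:
        candidates = detect(scored_pairs, t)
        identified = len(candidates & actual)
        # Assumed convention: 0.0 when there is nothing to divide by.
        precision = identified / len(candidates) if candidates else 0.0
        recall = identified / len(actual) if actual else 0.0
        results[t] = (len(candidates), precision, recall)
    return results


# Invented toy data: three real clone pairs and one false candidate.
toy_scores = {
    ("f1", "f2"): 0.95,
    ("f3", "f4"): 0.85,
    ("f5", "f6"): 0.75,
    ("f7", "f8"): 0.72,  # not an actual clone
}
toy_actual = {("f1", "f2"), ("f3", "f4"), ("f5", "f6")}

toy_results = sweep(toy_scores, toy_actual, [0.7, 0.8, 0.9])
```

On this toy input, raising the threshold from 0.7 to 0.9 shrinks the candidate set from four pairs to one, so recall drops from 1.0 to 1/3 while precision rises from 0.75 to 1.0. The numbers only illustrate the mechanism and are not comparable with Table 3.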
 