Algorithm 1. Clone Injection Algorithm
Require: sourceCode : Source code of the system under analysis;
Require: type : Type of the clones to generate;
Require: probCloning : (Constant) The probability of functions/methods to be cloned.
Ensure: The source code of the system with randomly injected clones;
Ensure: The tracking info of injected clones in the source code.
 1: function InjectClones(sourceCode, type)
 2:     functionList ← parseAndExtractFunctionsFrom(sourceCode)
 3:     clonesTrackInfo ← ∅
 4:     for each function ∈ functionList do
 5:         probGenerateClone ← random(0, 1)
 6:         if probGenerateClone ≤ probCloning then
 7:             nCopies ← 0
 8:             dice ← random(0, 1)
 9:             while not (2^−(nCopies+1) ≤ dice ≤ 2^−nCopies) do
10:                 nCopies ← nCopies + 1
11:             end while
12:             for i = 1 to nCopies do
13:                 newClone ← Clone(function, type)
14:                 trackInfo ← Inject(sourceCode, newClone)
15:                 add(clonesTrackInfo, trackInfo)
16:             end for
17:         end if
18:     end for
19:     return clonesTrackInfo
20: end function
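The control flow of Algorithm 1 can be sketched in Python as follows. This is only an illustrative reading of the pseudocode, not the authors' implementation: the source code is simplified to a list of function bodies, `make_clone` is a hypothetical stand-in for Clone, appending to the list stands in for Inject, and the value of `PROB_CLONING` is an assumption (the text only says it is a constant). Under this reading, the while loop selects k copies with probability 2^−(k+1), so each additional copy is half as likely as the previous one.

```python
import random

PROB_CLONING = 0.3  # assumed value; the text only states that it is a constant


def draw_copy_count(dice):
    """Return the k such that 2**-(k+1) <= dice <= 2**-k (lines 7-11)."""
    n_copies = 0
    while not (2.0 ** -(n_copies + 1) <= dice <= 2.0 ** -n_copies):
        n_copies += 1
    return n_copies


def inject_clones(functions, clone_type, make_clone,
                  prob_cloning=PROB_CLONING, rng=None):
    """Return (functions plus injected clones, tracking info).

    `functions` is a list of function bodies (a simplification of the
    parsed source code). `make_clone(function, clone_type)` is a
    hypothetical callable that produces one clone. Each tracking entry
    is a pair (index of the original, index of the injected clone).
    """
    rng = rng or random.Random()
    injected = list(functions)
    clones_track_info = []
    for original_index, function in enumerate(functions):
        if rng.random() <= prob_cloning:
            # 1 - random() lies in (0, 1], which avoids dice == 0.
            dice = 1.0 - rng.random()
            for _ in range(draw_copy_count(dice)):
                new_clone = make_clone(function, clone_type)
                injected.append(new_clone)  # stand-in for Inject
                clones_track_info.append((original_index, len(injected) - 1))
    return injected, clones_track_info
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when the tracking info is later used as ground truth for evaluating a detector.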
generated fraction of the total number of statements in the analyzed function.
In this way, we avoid generating a totally different function that would no
longer be an actual clone of the target one.
Finally, in the case of Type 4 clones, the mutation operations include the
reordering of statements (Line 9) and the replacement of equivalent control
structures (Line 10). In particular, the former is applied only to declarations
and independent statements, while the latter substitutes any control structures
that occur with other, semantically equivalent ones. For instance, for loops
may be replaced with while loops, and if-elseif conditions may be replaced with
switch-case structures.
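As a small illustration of the two Type 4 mutations described above, the following Python pair shows a function and a hand-written variant in which two independent declarations are reordered and the for loop is replaced by an equivalent while loop. The function names are hypothetical and the example is not taken from the data set; it only sketches what such a clone pair might look like.

```python
def sum_of_squares_original(values):
    total = 0
    count = len(values)
    for i in range(count):
        total += values[i] * values[i]
    return total


def sum_of_squares_type4(values):
    # Statement reordering: the two declarations are independent,
    # so swapping them does not change the behavior.
    count = len(values)
    total = 0
    # Control structure replacement: for loop rewritten as a while loop.
    i = 0
    while i < count:
        total += values[i] * values[i]
        i += 1
    return total
```

The two functions differ syntactically but return the same result for every input list, which is the property a Type 4 clone is meant to preserve.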
5 Case Study
In the preceding section we discussed the limits of the existing data sets for
clone detection and described how an artificial data set can be produced.
Although such a data set is intended for training, we also used it to assess the
Tree Kernel technique described in [9]. Even though we cannot assume that the
performance obtained on artificial data will generalize to the real case, these
experiments allow us to obtain a better understanding of the strengths and
weaknesses of a kernel-based clone detection approach.
 