Algorithm 1. Clone Injection Algorithm
Require: sourceCode: Source code of the system under analysis;
Require: type: Type of the clones to generate;
Require: probCloning: (Constant) The probability of functions/methods to be cloned.
Ensure: The source code of the system with randomly injected clones;
Ensure: The tracking info of injected clones in the source code.
 1: function InjectClones(sourceCode, type)
 2:     functionList ← parseAndExtractFunctionsFrom(sourceCode)
 3:     clonesTrackInfo ← ∅
 4:     for each function ∈ functionList do
 5:         probGenerateClone ← random(0, 1)
 6:         if probGenerateClone ≤ probCloning then
 7:             nCopies ← 0
 8:             dice ← random(0, 1)
 9:             while not (2^−(nCopies+1) ≤ dice ≤ 2^−nCopies) do
10:                 nCopies ← nCopies + 1
11:             end while
12:             for i = 1 to nCopies do
13:                 newClone ← Clone(function, type)
14:                 trackInfo ← Inject(sourceCode, newClone)
15:                 add(clonesTrackInfo, trackInfo)
16:             end for
17:         end if
18:     end for
19:     return clonesTrackInfo
20: end function
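The control flow of Algorithm 1 can be sketched in Python as follows. This is
only an illustrative sketch under several assumptions: the source code is
modeled as a plain list of function bodies (so parsing is skipped), make_clone
is a hypothetical placeholder that returns an exact copy instead of applying
type-specific mutations, injection simply appends the clone to the list, and
the dice is drawn from (0, 1] so that it always falls into one of the
intervals [2^−(k+1), 2^−k].

```python
import random


def draw_n_copies(rng):
    """Pick k such that 2^-(k+1) <= dice <= 2^-k, as in the while loop."""
    # 1.0 - random() lies in (0, 1], so dice is never exactly 0 (assumption).
    dice = 1.0 - rng.random()
    n_copies = 0
    while not (2.0 ** -(n_copies + 1) <= dice <= 2.0 ** -n_copies):
        n_copies += 1
    return n_copies


def make_clone(function, clone_type):
    """Hypothetical stand-in for Clone(function, type): returns an exact copy."""
    return function


def inject_clones(functions, clone_type, prob_cloning, rng=None):
    """Return (new source, tracking info); functions is a list of bodies."""
    rng = rng or random.Random()
    source = list(functions)          # simplified model of sourceCode
    clones_track_info = []
    for function in functions:
        # Each function is selected for cloning with probability prob_cloning.
        if rng.random() <= prob_cloning:
            for _ in range(draw_n_copies(rng)):
                new_clone = make_clone(function, clone_type)
                source.append(new_clone)      # simplified Inject step
                clones_track_info.append(
                    {"origin": function, "clone_index": len(source) - 1}
                )
    return source, clones_track_info
```

Under this scheme the number of copies k is drawn with probability 2^−(k+1),
so about half of the selected functions receive no copy at all and large
numbers of copies become exponentially rare.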
generated fraction of the total number of statements in the analyzed function.
In this way, we avoid generating a completely different function that would
no longer be an actual clone of the target one.
Finally, in the case of Type 4 clones, the mutation operations include the
reordering of statements (Line 9) and the replacement of control structures
with equivalent ones (Line 10). In particular, the former is applied only to
declarations and independent statements, while the latter substitutes any
control structures that occur with other semantically equivalent ones. For
instance, for loops may be replaced with while loops, and if-elseif conditions
with switch-case structures.
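As a small illustration of these two mutations, the following pair of
hypothetical Python functions (not taken from the data set) differ only by a
reordering of independent statements and by the replacement of a for loop with
an equivalent while loop, yet compute the same result:

```python
def stats_for(values):
    """Original function: sum of squares and element count, using a for loop."""
    total = 0
    count = 0
    for v in values:
        total += v * v
        count += 1
    return total, count


def stats_while(values):
    """Type 4 style variant of stats_for with the same behavior."""
    # The two declarations are independent, so their order can be swapped.
    count = 0
    total = 0
    # The for loop is replaced by a semantically equivalent while loop.
    i = 0
    while i < len(values):
        v = values[i]
        # The two updates do not depend on each other and are reordered too.
        count += 1
        total += v * v
        i += 1
    return total, count
```

The two functions are syntactically different but semantically equivalent,
which is the kind of pair a Type 4 mutation is meant to produce.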
5 Case Study
In the preceding section we discussed the limits of the existing data sets for
clone detection and described how an artificial data set can be produced.
Although such a data set is intended for training, we also used it to assess
the Tree Kernel technique described in [9]. Even though we cannot assume that
the performance obtained on artificial data will generalize to real cases,
these experiments give us a better understanding of the strengths and
weaknesses of a kernel-based clone detection approach.