decreases, although less dramatically. After 30 minutes, the branch coverage level only
increases slightly, but many faults are detected in that period.
We also calculated the correlation between branch coverage and the normalized number
of faults. It varies widely from class to class, from 0.3 to 0.97, and there seems to be no
common pattern among the tested classes, as shown in Figure 9.
The implications of these results are twofold: (1) when coverage increases, the number
of faults discovered increases as well; (2) when coverage stagnates, faults are still found.
Thus increasing the branch coverage clearly increases the number of faults found. A high
branch coverage value is, however, clearly not sufficient to assess the quality
of a testing session.
The next section further elaborates on these findings as well as their limitations.
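As an illustration of the correlation analysis reported in this section, the following minimal Python sketch computes a correlation coefficient between a branch coverage series and a normalized fault series. It is not the tooling used in the experiments: the function names `normalize` and `pearson`, the choice of Pearson's coefficient, the normalization by the final fault count, and the sample values are all assumptions made for this example.

```python
from math import sqrt


def normalize(cumulative_faults):
    """Scale cumulative fault counts to [0, 1] by the final count (assumed normalization)."""
    total = cumulative_faults[-1]
    if total == 0:
        return [0.0 for _ in cumulative_faults]
    return [count / total for count in cumulative_faults]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical measurements for one class, sampled at regular intervals:
# coverage flattens out while faults keep accumulating.
coverage = [0.42, 0.55, 0.61, 0.63, 0.64, 0.64]
cumulative_faults = [3, 7, 10, 14, 17, 20]

r = pearson(coverage, normalize(cumulative_faults))
print(f"correlation between coverage and normalized faults: {r:.2f}")
```

Computing the coefficient per class, as done here for a single made-up series, is what allows values to be compared across the tested classes.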
4 Discussion
The results of the previous section provide material for answering three questions:
- Is branch coverage a good stopping criterion for random testing?
- Is it a good measure of testing effectiveness?
- What are the unexercised branches?
4.1 Branch Coverage as Stopping Criterion for Random Testing
Since random testing in general cannot achieve 100% branch coverage in finite time,
total branch coverage is not a feasible stopping criterion. In practice, the percentage
of code coverage is often used as an adequacy criterion: the higher the percentage,
the more adequate the testing [19]; and testing can be stopped once the generated test
suite reaches a certain level of adequacy. In our experiments, after 1 hour, the branch
coverage level hardly increases, so it would be impractical to extend the testing time until
reaching full coverage. Instead, the only reasonable way to use branch coverage would
be to estimate the expectation of finding new faults. As shown in the previous section,
however, the number of faults evolves closely with the coverage only in the first few
minutes of testing. In testing sessions longer than 10 minutes, the correlation degrades.
In fact, about 50% of the faults are found in a period where the branch coverage level
hardly increases any more. This means that branch coverage is not a good predictor of the
number of faults remaining to be found.
The correlation varies greatly from class to class. For some classes, such as
BINARY SEARCH TREE, the correlation coefficient is 0.98 and the relationship is almost
linear, but for others, such as ARRAYED STACK, the correlation is weak (0.3),
especially with longer testing sessions. This variation across classes under test reduces
the precision of branch coverage as a stopping criterion.
Random testing also detects different faults in different test runs while it exercises
almost the same branches. This confirms that multiple restarts drastically improve the
number of faults found [5]: to find as many faults as possible, a class should be
random-tested multiple times with different seeds, even if the same branches are
exercised every time.
Our conclusion is that branch coverage alone cannot be used as a stopping criterion
for random testing.
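The effect of restarting with different seeds can be sketched with a toy simulation in Python. This is not the testing tool or any of the classes used in the experiments: `toy_routine`, `run_session`, `run_with_restarts`, the input range, and the rule that decides which inputs trigger a fault are all hypothetical, chosen only so that every session exercises the same two branches while rare inputs expose different faults.

```python
import random


def toy_routine(x):
    """Hypothetical routine under test: returns (branch id, fault id or None)."""
    branch = "neg" if x < 0 else "non_neg"
    fault = None
    if x % 101 == 0:
        # Rare inputs trigger one of several distinct faults within the same branch.
        fault = (branch, (abs(x) // 101) % 7)
    return branch, fault


def run_session(seed, n_calls):
    """One random testing session: returns (branches exercised, faults found)."""
    rng = random.Random(seed)  # private generator, so a seed fully determines the session
    branches, faults = set(), set()
    for _ in range(n_calls):
        branch, fault = toy_routine(rng.randint(-10_000, 10_000))
        branches.add(branch)
        if fault is not None:
            faults.add(fault)
    return branches, faults


def run_with_restarts(seeds, n_calls):
    """Restart the session once per seed and accumulate branches and faults."""
    all_branches, all_faults = set(), set()
    for seed in seeds:
        branches, faults = run_session(seed, n_calls)
        all_branches |= branches
        all_faults |= faults
    return all_branches, all_faults


single_branches, single_faults = run_session(seed=0, n_calls=200)
many_branches, many_faults = run_with_restarts(seeds=range(20), n_calls=200)
print("one session: ", len(single_branches), "branches,", len(single_faults), "faults")
print("20 restarts: ", len(many_branches), "branches,", len(many_faults), "faults")
```

In this sketch the accumulated fault set can only grow with each restart, while the set of exercised branches saturates almost immediately, which mirrors the reason a coverage-based stopping rule would end testing too early.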
 