Is Branch Coverage a Good Measure of Testing Effectiveness? - Empirical Software Engineering and Verification

for Eiffel, to gain insights on three questions: (1) the actual branch coverage achieved by testing Eiffel classes with AutoTest, (2) whether the achieved branch coverage correlates with the number of bugs found in the code, (3) whether branch coverage is a good stopping criterion for random testing. Despite the popularity of both random testing and branch coverage, there is little data available on the topic.

We tested 14 Eiffel classes using our fully automated random testing tool AutoTest for 2520 hours. AutoTest tested each class in 30 runs, with each run 6 hours long. For each run, we recorded the exercised branches and faults detected over time. The main results are:

- Random testing reaches 93% branch coverage on average.
- Different test runs of the same class, with different seeds for the pseudo-random number generator, exercise almost the same branches, but detect different faults.
- At the beginning of the testing session, branch coverage and faults both increase dramatically and are strongly correlated.
- 90% of all the exercised branches are exercised in the first 10 minutes. After that, the branch coverage level increases slowly. After 30 minutes, branch coverage further increases by only 4%.
- Over 50% of faults are detected after 30 minutes.
- There is a weak correlation between number of faults found and coverage over the 2520 hours of testing.
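The 2520-hour budget follows directly from the run configuration reported above. As a quick consistency check, the arithmetic can be sketched as follows; the variable names are ours and purely illustrative, while the three input figures come from the text.

```python
# Figures as reported in the text.
classes = 14
runs_per_class = 30
hours_per_run = 6

total_runs = classes * runs_per_class              # number of test runs overall
hours_per_class = runs_per_class * hours_per_run   # testing time spent on each class
total_hours = total_runs * hours_per_run           # total testing time

print(total_runs, hours_per_class, total_hours)
```

Under these figures, each class receives 180 hours of testing across 420 runs in total, which matches the 2520 hours quoted.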

The main implication of these results is that branch coverage is an inadequate stopping criterion for random testing. As AutoTest conveniently builds test suites randomly as it tests the code, the branch coverage achieved at any point in time corresponds to the branch coverage of the test suite built since the beginning of the testing session. Because there is a strong correlation between faults uncovered and branch coverage when coverage increases, higher branch coverage implies uncovering more faults. However, half of the faults can be further discovered with hardly any increase in coverage. This confirms that branch coverage by itself is not in general a good indicator of the quality of a test suite.
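To illustrate why a coverage plateau is a risky stopping signal, the following minimal sketch applies a simple plateau rule to a single run. The timeline is invented for illustration and is not data from the experiment; the rule, the function name `plateau_stop`, and the `min_gain` threshold are likewise assumptions of ours, not part of AutoTest.

```python
# INVENTED timeline for one hypothetical run (not data from the experiment):
# (minutes elapsed, branch coverage in %, cumulative faults found).
SAMPLES = [
    (0, 0, 0),
    (10, 84, 3),
    (20, 87, 4),
    (30, 89, 5),
    (60, 90, 6),
    (120, 91, 8),
    (240, 92, 10),
    (360, 93, 12),
]


def plateau_stop(samples, min_gain):
    """Return the first sample whose coverage gain over the previous
    sample is below `min_gain` percentage points (a coverage-plateau
    stopping rule); fall back to the last sample if none qualifies."""
    for previous, current in zip(samples, samples[1:]):
        if current[1] - previous[1] < min_gain:
            return current
    return samples[-1]


stop_minute, stop_coverage, faults_at_stop = plateau_stop(SAMPLES, min_gain=2)
final_faults = SAMPLES[-1][2]
missed = final_faults - faults_at_stop

print(f"stopped at minute {stop_minute} with {stop_coverage}% coverage")
print(f"faults found: {faults_at_stop}, faults missed: {missed}")
```

In this made-up run the rule stops once coverage has nearly flattened, yet half of the faults eventually found appear only afterwards, which is the pattern the paragraph above describes.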

A package is available online (see footnote 2) containing the source code of the AutoTest tool and instructions to reproduce the experiment.

Section 2 describes the design of our experiment. Section 3 presents our results. We discuss the results in Section 4 and the threats to validity in Section 5. We present related work in Section 6 and conclude in Section 7.

2 Experiment Design

The experiment on which we base our results consists in running automated random testing sessions of Eiffel classes. We first describe contract-based unit testing for object-oriented (O-O) programs, then introduce AutoTest, and present the classes under test, the testing time and the computing infrastructure.

2 http://se.inf.ethz.ch/people/wei/download/branch_coverage.zip
