Is Branch Coverage a Good Measure of Testing Effectiveness? - Empirical Software Engineering and Verification

For Faults before branch, the faults should either be fixed first or avoided while testing. For the Implementation limitation and Concurrent context needed categories, we need to further improve AutoTest.

5 Threats to Validity

Four observations may raise questions about the result.

Representativeness of chosen classes. Although they were chosen from the widely used Eiffel library EiffelBase and vary in code metrics and intended semantics, the chosen classes may not be fully representative of general O-O programs.

Representativeness of AutoTest's variant of random testing. We tried to keep the algorithm of AutoTest as general as possible, but other implementations of random testing may produce different results.

Branch coverage below 100%. We do not know whether the correlation between branch coverage and number of faults still holds when all branches are exercised. We consider this very likely, since if we considered the application trimmed of all the branches that were not visited, we would then achieve 100% branch coverage in most cases.
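The trimming argument can be illustrated with a minimal sketch. It assumes branches are identified by simple labels and that coverage is the fraction of known branches exercised at least once; the function name and the branch labels are hypothetical and are not taken from AutoTest.

```python
def branch_coverage(visited, all_branches):
    """Return the fraction of branches in all_branches that were exercised."""
    all_branches = set(all_branches)
    if not all_branches:
        return 1.0  # nothing to cover: treat as fully covered
    return len(set(visited) & all_branches) / len(all_branches)


# Hypothetical branch labels, for illustration only.
all_branches = {"b1", "b2", "b3", "b4", "b5"}
visited = {"b1", "b2", "b4"}

# Coverage measured against the original application: 3 of 5 branches.
original = branch_coverage(visited, all_branches)

# "Trimmed" application: drop every branch that was never visited,
# so each remaining branch has been exercised.
trimmed_branches = all_branches & visited
trimmed = branch_coverage(visited, trimmed_branches)

print(f"original: {original:.0%}, trimmed: {trimmed:.0%}")
```

In this idealized model the trimmed application is always fully covered. The sketch does not capture whatever practical cases lie behind the "in most cases" qualification above.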

Size of test suite. A recent formal analysis [3] of random testing showed that the number of tests made has a great influence on the results found with random testing. It is possible that, although our study relies on many more tests than previous ones, we did not execute enough tests. We consider this unlikely because of the high similarity of the faults found in the present experiments.

6 Related Work

Intuitively, random testing cannot compete in terms of effectiveness with systematic testing because it is less likely that randomly selected inputs will be interesting enough to reveal faults in the program under test. Some studies [17,16] have shown that random testing is as effective as some systematic methods such as partition testing. Our results also showed that random testing is effective: in the experiment, random testing detected 328 faults in 14 classes in the EiffelBase library while in the past 3 years, only 28 faults were reported by users.

Ciupa et al. [5] investigated the predictability of random testing and showed that in terms of the number of faults detected over time, random testing is predictable. Figure 5 and Figure 6 confirm those results.

Many studies compare branch coverage with other criteria in assessing the effectiveness of test strategies. Frankl et al. [7] compared the branch coverage criterion with the all-uses criterion and concluded that, for their programs, all-uses adequate test sets perform better than branch adequate test sets, and branch adequate test sets do not perform significantly better than null-adequate test sets, which are test sets containing randomly selected test cases without any adequacy requirement. The present study focuses more on the branch coverage level achieved by random testing in a certain amount of time and the number of faults found in that period.
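As a rough illustration of how branch coverage and fault counts sampled over the same testing period could be related, the sketch below computes a Pearson correlation coefficient. The choice of Pearson's coefficient is an assumption made for illustration, not necessarily the statistic used in this study, and the sample values are invented placeholders rather than measured results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (dev_x * dev_y)


# Invented placeholder data: branch coverage and cumulative faults found,
# sampled at the same points in time during one hypothetical test session.
coverage = [0.40, 0.55, 0.62, 0.66, 0.68]
faults = [3, 7, 9, 10, 10]

r = pearson(coverage, faults)
print(f"correlation between coverage and faults: {r:.3f}")
```

A coefficient close to 1 would indicate that the two quantities grow together over the session. It says nothing about whether higher coverage causes more faults to be found.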
