Property-Based Testing - The ProTest Project - Formal Methods for Components and Objects

Information Technology Reference

In-Depth Information

474 iterations (which took 9 hours to collect). In total 1085 tests were executed,

611 of which passed, and 474 of which failed. It is important to bear in mind

that this would have been virtually impossible if it was up to a human to collect

the traces, and that the diversity of the traces would have been substantially

less if the collection process had not been guided by the inference algorithm.

To provide some intuition of the process, the test executions generated for the

first 10 test iterations are shown in Figure 3, with the hypothesis model generated

for the 10th iteration shown at the bottom and the reference model shown on

the right. Failed tests are prefixed with a '-', and passed tests are prefixed with

a '+'. At the beginning, the process starts with the most general LTS, where

everything loops to the same state. The first tests are very short, because most

of them fail instantly. Nonetheless, these already form a model that has some

important commonalities with the reference model; any communication must

either start with a listen or an open .An open must be followed by a syn ack

before anything else can happen, and a send last ack returns to the initial

state.

As more tests are executed, and more possible and impossible sequences are

identified, the model is gradually refined. Every test that conflicts with the cur-

rent model leads to a new, improved model. Depending on the current hypothesis

model, QuickCheck will generate tests that exercise scenarios that were not envis-

aged. In Figure 3, the test listen, send last ack, listen, passive close

passes (does not cause a program or pre/post-condition failure), even though

it should not be possible according to the reference model (also in Figure 3).

To provide an intuition of the extent of the final set of traces that the tech-

nique ended up with, the final APTA is shown in Figure 4. Although far too

small to read the labels 5 , it does show that the iterative test-generation-and-

mode-refinement process was successful at collecting a broad range of program

executions.

The final model consists of 38 states, with 277 permitted and 157 forbidden

transitions. Structurally, the inferred machine is completely different from the

reference model. It is however important to bear in mind that the reference

model is only partial - it only describes the small portion of behaviour that is

expected , and does not account for what may happen if unexpected sequences of

events are produced. For this to be the case it would have to account for every

operation at every state. It is this more complete spectrum of behaviour that is

represented by the inferred model.

Although more representative of the actual TCP implementation, the in-

ferred model inevitably still contains inaccuracies. The trace-collection process is

hampered by the fact that it had to wait for timeouts. More traces would in-

evitably have led to a more accurate machine. Nonetheless, given that no infer-

ence algorithm has been empirically shown to outperform the EDSM algorithm

yet [17], the inferred model is the most accurate one possible with current state

machine inference techniques.

5 The PDF version of this paper does permit the user to zoom in to read the labels.

Formal Methods for Components and Objects

Search WWH ::

Custom Search

Home