Databases Reference
In-Depth Information
formed the basis for most state machine inference techniques in the domain
of software engineering [1, 11, 21{23, 31].
However, since then a host of advances have been made in the field of
grammar inference that have not been exploited in the software engineering
community [36]. The rest of this section will show how these can be applied to
solve three of the key problems with the k-tails technique that were outlined
in the previous section.
3.3.1 Negative Samples
As stated in Section 3.2.3 problem 1, the k-tails technique assumes that all
traces are valid. This means that it is never possible to obtain a characteristic
sample of traces (which requires negative samples { see the discussion of prob-
lem 3), making it impossible to ensure an accurate state machine. It is only
by providing a large and diverse set of positive as well as negative examples,
that the inference of accurate (or at least reasonably accurate) state machines
becomes a possibility. So what is meant by a \negative" program trace? If
the state machine inference process is akin to that shown in Figure 3.2, where
should these negative traces come from? Some answers to these questions are
provided below.
The usual assumption is that a trace must be by denition positive { it
represents an observed, valid program execution. This assumption is however
not necessarily correct. There are two circumstances that can give rise to
\negative" program traces:
1. Program executions that fail: Depending on the nature of the soft-
ware (and the choice of semantics attributed to program traces), execu-
tions that end by throwing an exception, or terminate with an outright
failure can be deemed to produce negative traces.
2. Infeasible executions: For most software systems, the vast majority
of theoretically conceivable program executions are in fact infeasible.
For example, with respect to the text editor in Figure 3.1, to begin an
execution with a \close" action is not even an option, because no le
has been opened in the first place.
These two notions enable us to distinguish between valid and invalid traces;
those that belong to the prescribed and proscribed language of the underlying
software system. This in turn enables us to employ more sophisticated infer-
ence algorithms than the k-tails algorithm, as will be discussed in following
subsections.
3.3.2 Improved State-Merging Heuristics
As detailed previously, the k-tails strategy is prone to merging states that
are not equivalent, and not merging states when they should be merged. The
 
Search WWH ::




Custom Search