Databases Reference
In-Depth Information
more in-depth discussion the reader is referred to the original paper by Lang
et al. [19] or the overview by Cicchello and Kremer [9].
3.3.3 Active Approaches
Any analysis approach that attempts to derive information about a soft-
ware system from traces alone is vulnerable to problem 3 in Section 3.2.3.
The selection of traces is critical to be able to draw valid conclusions about
the underlying software system [14]. This presents us with an important prob-
lem. In a setting where the developer is wanting to mine the specification for
the software system, it is unreasonable to expect very much (if any) prior
knowledge about how the system behaves. Consequently, it is unrealistic to
expect the ability to derive a collection of traces that is suciently diverse to
adequately exercise every aspect of software behavior.
There is an analogous problem in the field of grammar inference: How
does one obtain the set of sentences that are necessary to accurately infer a
grammar? One solution was proposed by Angluin in 1987 [2] in the form of
the L learning algorithm. This algorithm does not require any initial samples
at all; instead it proposes its own traces (called \membership queries"), and
assumes that there is an oracle (called a \minimally adequate teacher") that
can provide feedback to the algorithm. This feedback is provided in two forms:
(1) being able to state whether individual traces do or do not belong to the
target machine, and (2) being able to state whether a hypothesis machine is
equivalent to the target, and if not why not. Given that these answers can
be provided, Angluin established that it is possible to infer the exact state
machine in polynomial time.
The L algorithm works by constructing an \observation table," which
records whether certain sequences over are accepted or not. These se-
quences are asked in the form of membership queries, and the querying pro-
cess stops once the table has been fully populated (or in more formal terms {
it is closed and consistent ). Once this is the case, it is possible to conjecture
a hypothesis LTS. This solution is suitable for situations where the target
machine is relatively simple, and we have an oracle that can be asked as many
questions as are necessary. However, if the state machines are large and com-
plex, or there is a limit on the number of queries that can be answered by the
oracle, the algorithm becomes impractical.
Nonetheless, the L approach is appealing from the point of view that it
is active. The provision of an initial set of traces is no longer critical to its
ability to produce an accurate result. This idea of active inference has subse-
quently motivated work by Dupont et al., who developed an active variant of
the EDSM-Blue Fringe algorithm called the Question State Merging (QSM)
algorithm [12, 13]. The inference process is the same as with EDSM, but this
time the algorithm is equipped with the ability to ask questions after every
state-merge, to ensure that the merge is correct. In comparison to the L algo-
rithm, no guarantees can be made about the absolute correctness of the final
 
Search WWH ::




Custom Search