Adapting Grammar Inference Techniques to Mine State Machines - Mining Software Specifications: Methodologies and Applications

Databases Reference

In-Depth Information

characteristic set of traces is practically infeasible for any non-trivial software

system.

(4) Reliance on traces alone often renders the inference process

more expensive than it needs to be. As stated in the previous point,

the size of the trace-set that is necessary to infer an accurate state machine

tends to be too large to collect in practice. However, it is often the case that

the user of the technique is aware of certain constraints or rules that would

obviate the need for a potentially vast number of traces. The problem is that

the conventional reverse-engineering process does not provide a means for

supplying such facts.

The rest of this chapter will introduce selection approaches that can be

used to address these problems. For problems 1 through 3, we resort to tech-

niques that are either directly lifted from or inspired by the closely related field

of (regular) grammar inference (presented in the following section). For prob-

lem 4 we describe a technique of our own [34], which enables the integration

of temporal constraints into the inference process.

3.3 Applying Advances from the Field of Grammar

Inference

The field of grammar inference is concerned with the development of al-

gorithms or procedures that can be used to infer grammars from samples of

their languages. Its roots can be traced back to the 1950s, in Moore's work

on \gedanken experiments" on state machines [24] and Nerode's work on the

synthesis of machines from equivalence relations [25]. However it was Gold's

work in 1967 that was arguably most influential, establishing the theoretical

limits of grammar learnability [16, 29], and forming a theoretical framework

for the current techniques.

The field is very broad; a host of different approaches have been developed

to suit different grammar classes and learning settings. Regular grammars are

the simplest class of formal grammars, and have consequently been the subject

of most inference research. The reason that regular grammar inference is of

particular import to this chapter is that regular grammars can be represented

as deterministic labeled transition systems [18]. Consequently, techniques that

apply to the inference of regular grammars automatically apply to the problem

of inferring state machines.

The k-tails technique was initially published as in the context of generic

state machine inference [4], without any specific applications to the domain of

software engineering. It was four years later [5] that the authors applied it to

infer state machines from program traces. Since then the k-tails technique has

Search WWH ::

Custom Search

Home