Static Specification Mining Using Automata-Based Abstractions* - Mining Software Specifications: Methodologies and Applications

Databases Reference

In-Depth Information

abstractions, as well as our summarization algorithms. We also expect these

abstractions to be useful in the context of other analyses that track temporal

sequences.

6.8

Related Work

Dynamic Analysis. When it is feasible to run a program with adequate cov-

erage, dynamic analysis represents the most attractive option for specification

mining, since dynamic analysis does not suffer from the diculties inherent

to abstraction.

Cook and Wolf [5] consider the general problem of extracting a FSM model

from an event trace, and reduce the problem to the well-known grammar infer-

ence [14] problem. Cook and Wolf discuss algorithmic, statistical, and hybrid

approaches, and present an excellent overview of the approaches and funda-

mental challenges. This work considers mining automata from uninterpreted

event traces, attaching no semantic meaning to events.

Ammons et al. [2] infer temporal and data dependence specifications based

on dynamic trace data. This work applies sophisticated probabilistic learning

techniques to boil traces down to collections of finite automata which char-

acterize the behavior. Lo and Khoo [20] extend Ammons' work, and employ

machine learning techniques to mine probabilistic temporal specifications from

dynamic execution traces.

Whaley et al. [31] present a two-phased approach to mining temporal API

interfaces, combining a static component-side safety check with a dynamic

client-side sequence mining. The static analysis is extremely simple, and used

primarily to restrict the dynamic search of temporal sequences, rather than

to directly infer specifications. This work presents several insights on how to

refine results, based on side-effect free methods, and partitioning methods

based on how they access fields. We plan to incorporate these insights into a

future version of our analysis.

The Perracotta tool [32] addressed challenges in scaling dynamic mining of

temporal properties to large code bases. This tool mines traces for two-event

patterns with an ecient algorithm, and relies on heuristics to help identify

interesting patterns.

Livshits and Zimmermann [19] mine a software repository revision history

to find small sets of methods whose usage may be correlated. This analysis is

simple and scalable; in contrast to ours, it does not consider temporal ordering

nor aliasing. In a second (dynamic) phase, the system checks whether candi-

date temporal patterns actually appear in representative program traces. Our

analysis technology could perhaps be employed in a similar architecture to

provide a more effective first phase of mining.

Search WWH ::

Custom Search

Home