Certain categories of patterns can be gleaned from the AntiPattern literature [13, 42], although many AntiPatterns tend to deal more with high-level architectural concerns than with low-level coding issues. In the rest of this section,
we review literature pertinent to revision history mining and software model
extraction.
7.8.1 Revision History Mining
One of the most frequently used techniques for revision history mining is
co-change. The basic idea is that two items that are changed together are
related to one another. These items can be of any granularity; in the past, co-change has been applied to changes in modules [19], files [5], classes [6, 20], and functions [51]. Recent research improves on co-change by applying data mining techniques to revision histories [49, 53]. Michail used data mining on the source code of programming libraries to detect reuse patterns, but applied it only to single snapshots, not to revision histories [32, 33]. Our work is the first to apply co-change and data mining based on method calls. While Fischer et al. were the first to combine bug databases with dynamic analysis [18], our work is the first to combine the mining of revision histories with dynamic analysis.
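To make the co-change idea above concrete, the following is a minimal sketch, not a reimplementation of any of the cited tools; the revision history, item names, and support threshold are hypothetical:

```python
from collections import defaultdict
from itertools import combinations

def mine_co_changes(commits, min_support=2):
    """Count how often two items (files, classes, methods, ...) are
    changed together in the same commit."""
    pair_counts = defaultdict(int)
    for changed_items in commits:  # one set of changed items per commit
        for a, b in combinations(sorted(set(changed_items)), 2):
            pair_counts[(a, b)] += 1
    # Keep only pairs that co-changed at least min_support times.
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}

# Hypothetical revision history: each entry is the set of items one commit touched.
history = [
    {"Parser.java", "Lexer.java"},
    {"Parser.java", "Lexer.java", "README"},
    {"Parser.java", "Lexer.java"},
    {"README"},
]
print(mine_co_changes(history))
# {('Lexer.java', 'Parser.java'): 3}  -> Parser and Lexer appear to be related
```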
The work most closely related to ours is that by Williams and Hollingsworth [46]. They were the first to combine program analysis and revision history mining. Their paper proposes improving the error ranking of a static return value checker using information about fixes obtained from revision histories. Our work differs from theirs in several important ways: they focus on prioritizing or improving existing error patterns and checkers, whereas we concentrate on discovering new ones. Furthermore, we use dynamic analysis and thus do not face the high false-positive rates their tool suffers from.
Recently, Williams and Hollingsworth also turned toward mining function
usage patterns from revision histories [47]. In contrast to our work, they focus
only on pairs and do not use their patterns to detect violations.
7.8.2 Model Extraction
Most work on automatically inferring state models of components of software systems has been done using dynamic analysis techniques. The Strauss
system [3] uses machine learning techniques to infer a state machine repre-
senting the proper sequence of function calls in an interface.
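The following is not the Strauss approach itself (which employs machine learning), but a minimal sketch of the general idea of inferring permitted call sequences from observed traces and flagging deviations; all method names and traces are hypothetical:

```python
from collections import defaultdict

def learn_call_model(traces):
    """Record which call may directly follow which, based on observed traces."""
    transitions = defaultdict(set)
    for trace in traces:
        for current_call, next_call in zip(trace, trace[1:]):
            transitions[current_call].add(next_call)
    return transitions

def violations(trace, transitions):
    """Return the call pairs in `trace` that were never observed while learning."""
    return [(a, b) for a, b in zip(trace, trace[1:]) if b not in transitions[a]]

# Hypothetical traces of a file-like interface.
good_traces = [
    ["open", "read", "read", "close"],
    ["open", "read", "close"],
]
model = learn_call_model(good_traces)
print(violations(["open", "close", "read"], model))
# [('open', 'close'), ('close', 'read')] -> transitions never seen in the traces
```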
Dallmeier et al. trace call sequences and correlate sequence patterns with
test failures [12]. Whaley et al. [45] hardcode a restricted model paradigm
so that probable models of object-oriented interfaces can easily be extracted automatically. Alur et al. [2] generalize this to automatically produce small,
expressive finite state machines with respect to certain predicates over an ob-
ject. Lam and Rinard use a type system-based approach to statically extract
 