collected from the literature, newsgroups, and previous bug reports, application
programmers are rarely able to tell which invariants hold for the APIs they use.
The situation is only slightly better for software architects and API designers,
who are generally much more aware of application-specific patterns.
In this chapter we propose an automatic way to extract likely error patterns
by mining software revision histories. Moreover, to ensure that all the errors
we find are relatively easy to confirm and fix, our experiments pay particular
attention to errors that can be fixed with a one-line change. It is worth
pointing out that many well-known error patterns, such as memory leaks,
double frees, mismatched locks, open and close operations on operating system
resources, buffer overruns, and format string errors, can often be addressed
with a one-line fix. Looking at incremental changes between revisions, as
opposed to complete snapshots of the source, allows us to better focus our
mining strategy and obtain more precise results. Our approach uses revision
history information to infer likely error patterns. We then experimentally
evaluate the patterns we extracted by checking for them dynamically.
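As an illustration of working with incremental changes rather than full snapshots, the following minimal sketch computes the method calls that a single check-in adds to a file. It is an assumption-laden approximation, not the tool's implementation: the regular expression, the keyword list, and the helper names `calls` and `added_calls` are hypothetical, and a real implementation would parse the Java source instead of pattern-matching text.

```python
import re
from collections import Counter

# Rough approximation of a call site: an identifier followed by "(".
# This also matches method declarations and constructor calls.
CALL = re.compile(r"\b([A-Za-z_]\w*)\s*\(")

# Java keywords that can precede "(" and must not be counted as calls.
KEYWORDS = {"if", "for", "while", "switch", "catch", "return", "synchronized"}


def calls(source: str) -> Counter:
    """Count apparent method-call names in one revision of a file."""
    return Counter(name for name in CALL.findall(source) if name not in KEYWORDS)


def added_calls(old: str, new: str) -> set[str]:
    """Names whose call count increased between two revisions of a file."""
    before, after = calls(old), calls(new)
    return {name for name, count in after.items() if count > before[name]}


# Hypothetical one-line fix: the new revision adds the missing removeListener call.
old_revision = "void f() { o.addListener(l); }"
new_revision = "void f() { o.addListener(l); o.removeListener(l); }"
fix = added_calls(old_revision, new_revision)
```

Each check-in then yields a small set of added calls, which is the kind of input a pattern miner can treat as one transaction.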
We have performed experiments on Eclipse and jEdit, two large, widely used
open-source Java applications. Both Eclipse and jEdit have many man-years of
software development behind them and, as a collaborative effort of hundreds of
people across different locations, are good targets for revision history
mining. By mining CVS, we have identified 56 high-probability patterns in
Eclipse and jEdit APIs, all of which were previously unknown to us. Out of
these, 21 were dynamically confirmed as valid patterns and 263 pattern
violations were found.
7.1.1 Contributions
This chapter makes the following contributions:
• We present DynaMine,¹ a tool for discovering usage patterns and detecting
  their violations in large software systems [28, 29]. All of the steps
  involved in mining and running the instrumented application are accessible
  to the user from within an Eclipse plugin: DynaMine automates the task of
  collecting and pre-processing revision history entries and mining for
  common patterns. Likely patterns are then presented to the user for review;
  runtime instrumentation is generated for the patterns that the user deems
  relevant. Results of dynamic analysis are also presented to the user in an
  Eclipse view.
• We propose a data mining strategy that detects common usage patterns in
  large software systems by analyzing software revision histories. Our
  strategy is based on a classic Apriori data mining algorithm, which we
¹ The name DynaMine comes from the combination of Dynamic analysis and Mining
revision histories.
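For readers unfamiliar with Apriori, the sketch below shows the textbook idea restricted to pairs: count how many transactions contain each item, discard items below a support threshold, and only then count pairs built from the surviving items. It is a generic illustration, not DynaMine's variant of the algorithm; the function name `frequent_pairs`, the `min_support` parameter, and the sample method names are assumptions, with each transaction standing for the set of method calls added in one check-in.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return pairs of items that co-occur in at least min_support transactions."""
    # Pass 1: support of individual items.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: count only pairs whose members are both frequent
    # (the Apriori pruning step).
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    return {pair: c for pair, c in pair_counts.items() if c >= min_support}


# Hypothetical check-ins, each reduced to the set of calls it added.
checkins = [
    {"lock", "unlock"},
    {"lock", "unlock", "log"},
    {"lock", "unlock"},
    {"log"},
]
patterns = frequent_pairs(checkins, min_support=3)
```

The pruning in the second pass is what keeps the search tractable: a pair cannot be frequent unless both of its members are.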
 