Mining API Usage Protocols from Large Method Traces - Mining Software Specifications: Methodologies and Applications

Databases Reference

In-Depth Information

4.6

Related Work ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 105

4.6.1

Mining of Finite State Machines :::::::::::::::::::::::::::::::::: 105

4.6.2

Mining of Temporal Rules and Other Specications ::::::::::::: 107

4.6.3

Invariant Inference :::::::::::::::::::::::::::::::::::::::::::::::: 108

4.6.4

Other Related Work ::::::::::::::::::::::::::::::::::::::::::::::: 108

4.7

Conclusions :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 108

Bibliography ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 108

4.1 Introduction

Large software is typically built from several components that are reused

among different projects. Programmers access functionality from such compo-

nents via an application programming interface (API). While reusing existing

components can increase programmer productivity, API usage also implies

certain diculties. One such diculty is that many APIs implicitly constrain

the order of API method calls. For instance, using an output stream typically

follows the protocol: open the stream, write to it an arbitrary number of times,

and close the stream. Violating this protocol can lead to severe programming

errors, such as resource leakage due to unreleased resources. Although such

constraints exist for most APIs, few of them are documented appropriately.

Moreover, ordering constraints of method calls are rarely formalized, making

it impossible to check them automatically.

We address the problem of missing specifications of typical and correct API

usage by automatically inferring API usage protocols. This chapter presents a

dynamic analysis that mines common usage patterns from existing programs

using an API. The analysis produces finite state machines (FSMs) that de-

scribe the order in which API methods are typically called. Our approach is to

learn from the runtime behavior of programs, which provides several benefits.

First of all, dynamic analysis has precise and unambiguous information about

the program's control ow, aliasing between objects, and the type of each ob-

ject. Furthermore, execution frequencies help to focus on recurring patterns

while ignoring incidental call sequences.

A major challenge of dynamic specification inference is the large volume

of runtime events occurring in real-world programs. The presented analysis

focuses on small sets of related objects and method calls, object collaborations,

that appear together during program execution. Each object collaboration

can be analyzed separately. As a result, the analysis scales linearly with the

number of runtime events to analyze.

Experiments with an implementation of the analysis show that it infers

typical API usage protocols while being scalable to large volumes of runtime

events. Overall, we applied the analysis to 10 Java programs and analyzed 280

million method calls and returns. The largest set of runtime events contains

Mining Software Specifications: Methodologies and Applications

Search WWH ::

Custom Search

Home