Path-Aware Static Program Analyses for Specification Mining* - Mining Software Specifications: Methodologies and Applications - page 315

Databases Reference

In-Depth Information

There are several significant design issues that need to be resolved for the

derived preconditions to have any practical significance. We observe that using

simple set intersection on predicates is too fragile to yield interesting specifi-

cations in general. This is because the predicates generated are insuciently

abstract (e.g., \at a call to procedure p, variable x bound in p is read, and the

contents of locations a and b allocated in p are compared"). The intersection

of any set S with a set containing just these predicates would be non-empty

only if S contained identical predicates, reflecting the same operations on the

same variables and locations. To relax this limitation, we examine techniques

that allow us to define structural similarities among predicate sets. Such simi-

larities enable preconditions that specify properties which are abstracted over

variable names, references, and values. We are thus able to define precon-

ditions that dene more abstract properties such as \procedure q is called

whenever some integer variable v is greater than zero, and the contents of

some pair of locations l 1 and l 2 holding a value of type are equal."

While the use of intersection guarantees safety by ensuring that derived

preconditions for a procedure hold at all call-sites, it is not a robust mechanism

in the presence of errors. An error that causes a predicate to be omitted along

some path leading to a call to procedure p would result in the predicate not

being included as part of p's preconditions. To address this concern, we employ

frequent itemset and sequence mining on the predicates computed at each

call-site to p, and use the predicates that are most frequently occurring as

the preconditions for p. Like other mining-based approaches, we assume that

errors violating invariants occur infrequently, thus making mining a feasible

strategy to filter such deviations from the generated specifications.

9.2.1 Precedence Protocols

A particular important class of specifications is that which defines prece-

dence protocols, a protocol that dictates the order in which different program

components (such as functions) are executed. For example, a call to function

pthreadmutexinit must be present upstream on any program path from

a call to pthreadmutexlock , since the former initializes data structures

associated with the mutex into an initially unlocked state. 1 The precedence

relation between pthreadmutexinit and pthreadmutexlock forms a part

of a precedence protocol that defines how these functions can be used in pro-

grams.

In some cases, precedence protocols are well specified. Certain groups of

library functions, 2 (e.g., a call to accept should always be preceded by a call to

bind and socket , a call to pthreadmutexlock must always be preceded by

1 Note that this precedence relation does not imply that every call to pthreadmutexlock

is preceded by a unique corresponding call to pthreadmutexinit ; rather, it states that

there must be some pthreadmutexinit upstream from the call-site.

2 Our presentation throughout this chapter applies to both functions and procedures,

and we use the two terms interchangeably.

Next Page

Mining Software Specifications: Methodologies and Applications

Search WWH ::

Custom Search

Home