Path-Aware Static Program Analyses for Specification Mining* - Mining Software Specifications: Methodologies and Applications

Databases Reference

In-Depth Information

we derive the preconditions for f : \there exist a pair of locations (call them

a and b) such that a and b are used as arguments in an operation op 2 , and

hold the constant c ."

Surprisingly, by considering an alternative mapping of locations in the two

calls, we can deduce another equally valid specification. Prior to both calls it

is also the case that (i) two locations are allocated (` 1 and ` 2 in g 1 , and ` 3

and ` 5 in g 2 ) and used in operation op 1 ; and (ii) one location is written with

a constant (` 1 in g 1 and ` 5 in g 2 ).

To extract commonalities such as those among sets of predicates extant

at the two calls requires us to match locations, names, and constants across

these different sets. As the example illustrates, there are potentially many

such matches that can be constructed. Of course, some commonalities could

be extracted by examining the body of f , but this would compromise scala-

bility and modularity. Other commonalities can be derived by examining f 's

signature, the types of values stored in these locations, etc. We exploit some

of these heuristics in our implementation.

Our current formulation fails to identify any of them because it does not

abstract these components in the sets it constructs. Since the rules that de-

fine procedure abstraction and application compute preconditions based on

simple intersection of predicates from different call-sites or procedures, re-

spectively, they do not effectively deal with (a) programs that may have bugs

(and thus fail to reflect necessary preconditions along some paths) and (b)

programs which invoke a procedure at different call-sites whose preconditions

at these sites share only structural similarities. To address these limitations,

we require a more expressive matching operation than one based on simple

set intersection.

As we show in the next section, simply enumerating the set of all possible

matches over the predicate sets used to define preconditions is infeasible. We

therefore consider an alternate strategy to identify matches among the precon-

dition sets computed at different call-sites (or among procedures called at the

same call-site) inspired by data mining techniques. As we shall discuss, these

approaches sacrifice optimality for scalability and eciency; our experimental

results reveal that they yield surprisingly valuable specifications even in the

presence of complex control- and data-flow, even in the presence of bugs that

result in invariants being omitted along certain program paths.

9.2.5 Incorporating Mining

To make our discussion more concrete, consider the program fragment

shown in Figure 9.7.

This code fragment is from the procedure addlistenaddr found in file

serverconf.c from openssh 4.2p1. This procedure is called from the proce-

dure filldefaultserveroptions , which in turn precedes a call to bind .

In the body of the procedure, there are several calls to addonelistenaddr ,

which is responsible for setting an address family, eventually supplied as the

Mining Software Specifications: Methodologies and Applications

Search WWH ::

Custom Search

Home