One solution to this dilemma is to alter the static analysis to accept non-
declaratively complete programs. Partial program analysis, for example, guesses
the fully qualified names of any missing types using a number of contextual clues
[ 5 , 7 ]. This allows the static analysis to function in the presence of missing types,
but can degrade the precision of its results because information is missing. With regard to
link analysis, the result is that many links will refer to unknown types or methods.
The difficulty of partial program analysis lies in the ambiguity of most languages'
import mechanisms. Take Java as an example. Unresolved single type imports are
the best case, as they contain a fully qualified name, and so can be matched to
unresolved simple names. On-demand imports, those with a * operator, do not fully
specify which types they import, instead including all types within a given package
or type. This makes it unclear which package an unknown name belongs to.
It could be located in the same top-level package or any package for which an on-
demand import exists.
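
The ambiguity can be made concrete with a minimal sketch. It is an illustration only, not the algorithm of the cited work: the function name candidate_names, its parameters, and the org.example package names are assumptions, and implicit imports such as java.lang and nested types are ignored.

```python
def candidate_names(simple_name, current_package, imports):
    """Return the fully qualified names that `simple_name` could refer to.

    Hypothetical helper: `imports` is a list of Java import targets such
    as "org.example.util.Helper" or "org.example.io.*".
    """
    single = [i for i in imports if not i.endswith(".*")]
    on_demand = [i[:-2] for i in imports if i.endswith(".*")]

    # Best case: a single-type import ends in the simple name, so it
    # supplies the fully qualified name directly.
    for imp in single:
        if imp.rsplit(".", 1)[-1] == simple_name:
            return [imp]

    # Otherwise the name may live in the current package or in any
    # package (or type) named by an on-demand import.
    return [f"{prefix}.{simple_name}" for prefix in [current_package] + on_demand]
```

A name matched by a single-type import yields exactly one candidate, whereas a name covered only by on-demand imports yields one candidate per on-demand import plus one for the current package.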
A different solution for accommodating declaratively incomplete programs is au-
tomated dependency resolution [ 13 ]. Automated dependency resolution attempts to
automatically locate artifacts that contain the missing declarations, restoring a
program to declarative completeness. Its primary benefit with regard to link analysis
is that previously unknown referents can now be resolved, improving the fidelity of
the link graph.
The first step is to identify the names of the missing types, which is done in
much the same manner as in partial program analysis. Once the names are identified,
they are then matched against a collection of candidate artifacts that might contain
the missing declarations. The goal is to identify a set of artifacts that provide all
of the missing types while including as few unnecessary types as possible.
When this approach was applied to a large test set of open source programs, it was
found to double the number of declaratively complete programs.
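
The matching step resembles a set-cover problem, and one plausible way to approximate it is a greedy selection. The sketch below is an assumption made for illustration, not the procedure of the cited work: the function select_artifacts, the artifact names, and the tie-breaking rule are all hypothetical.

```python
def select_artifacts(missing, artifacts):
    """Greedily pick artifacts covering `missing` with few extra types.

    `artifacts` maps a hypothetical artifact name to the set of fully
    qualified type names it declares. Returns the chosen artifact names
    and any missing types that no artifact provides.
    """
    wanted = set(missing)
    remaining = set(missing)
    chosen = []
    while remaining and artifacts:
        # Prefer the artifact that covers the most still-missing types;
        # break ties by the fewest unneeded types, then by name.
        best = min(
            artifacts,
            key=lambda name: (
                -len(artifacts[name] & remaining),
                len(artifacts[name] - wanted),
                name,
            ),
        )
        covered = artifacts[best] & remaining
        if not covered:
            break  # no candidate provides the remaining types
        chosen.append(best)
        remaining -= covered
    return chosen, remaining
```

A greedy choice does not guarantee the smallest possible set of artifacts, but it is simple and reports the types that stay unresolved, which is the case where the program remains declaratively incomplete.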
11.6 Dependency Slicing
So far, this chapter has provided an overview of how code retrieval systems function,
and how static analysis can be used to improve them. The remainder of this chapter
will describe in detail a single application of static analysis to code retrieval. This
should provide insight into the complexities involved with integrating static analysis
into code retrieval.
The application we will focus on is dependency slicing. Dependency slicing is
designed to identify the minimal set of declarations required for a set of seed dec-
larations to compile and execute properly, and is similar to approaches used for
reducing the size of jar files [ 15 ]. The purpose of dependency slicing is to package
up the result of a search so that it can be imported into a project and immediately
reused. CodeGenie, a tool for test-driven code reuse, uses Sourcerer's dependency
slicing service to integrate search results with test cases, in order to identify results
that satisfy the test cases.
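
At its core, such a slice can be viewed as the set of declarations reachable from the seeds in a dependency graph. The following minimal sketch assumes that the dependencies are already available as a mapping from each declaration to the declarations it uses directly; the function dependency_slice and the graph representation are hypothetical and do not describe Sourcerer's actual service.

```python
from collections import deque


def dependency_slice(seeds, depends_on):
    """Return every declaration reachable from `seeds`.

    `depends_on` is an assumed mapping from a declaration name to the
    declarations it directly requires.
    """
    included = set(seeds)
    queue = deque(seeds)
    while queue:
        decl = queue.popleft()
        for dep in depends_on.get(decl, ()):
            # Each declaration is added once, so cycles terminate.
            if dep not in included:
                included.add(dep)
                queue.append(dep)
    return included
```

Declarations that are not reachable from any seed are left out, which is what keeps the packaged result small.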