Being based on Lucene, Sourcerer's index model is quite flexible. Depending on
the search application, an instance of Sourcerer's index schema can include a
subset of the field types listed above. The three code search applications built
on top of Sourcerer have used code index schemas with different configurations of
fields and associated data sources.
Fields for retrieval with signatures allowed precise construction of queries
expressing the desired method signatures and relations expected in test
cases in CodeGenie. Fields for retrieval based on structural similarity
enabled retrieval schemes in SSI, and "more like this" queries based on
usage in SAS. The rest of the index fields supported basic operations of the
code search applications, as in SCSE.
8.4.3.1 Structured Retrieval
Table 8.4 presents a subset of the fields available in the Sourcerer index. Sourcerer's
search index can be searched using Lucene's query language [25, 41]. The following
Lucene query demonstrates how different fields are utilized to express a query that
incorporates textual as well as structural information:
short_name: (day of week)
AND entity_type: METHOD
AND m_ret_type_sname_contents: String
AND m_args_fqn_contents: date
AND cdef: (date util)
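
As an illustration of how such a query could be assembled programmatically, the following minimal Python sketch joins field/value pairs into a Lucene-style query string. The helper names clause and build_query are hypothetical and not part of Sourcerer or Lucene. The sketch emits the same clauses as the query above, written without the space after the colon, and it does not escape Lucene's special characters.

```python
def clause(field, terms):
    """Render one fielded clause; multi-word values are grouped in parentheses."""
    words = terms.split()
    if not words:
        raise ValueError(f"no terms given for field {field!r}")
    value = words[0] if len(words) == 1 else "(" + " ".join(words) + ")"
    # Assumption: terms contain no Lucene special characters, so no escaping is done.
    return f"{field}:{value}"


def build_query(clauses):
    """Join fielded clauses with AND so that every clause must match."""
    return " AND ".join(clause(field, terms) for field, terms in clauses)


# Field names are taken from the example query in the text.
query = build_query([
    ("short_name", "day of week"),
    ("entity_type", "METHOD"),
    ("m_ret_type_sname_contents", "String"),
    ("m_args_fqn_contents", "date"),
    ("cdef", "date util"),
])
print(query)
```

Keeping the clauses as a list of pairs makes it straightforward to add or drop a structural constraint, such as the return type, without touching the textual part of the query.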
Index field                    Description
-----------------------------  ------------------------------------------------
Fields for basic retrieval
  fqn_contents                 Tokenized terms from the FQN of an entity
  short_name                   Rightmost fragment of the FQN (w/o method
                               arguments for methods)
Fields for retrieval with signatures
  m_args_fqn_contents          Method's formal arguments tokenized into terms
  m_ret_type_sname_contents    Short name of the method's return type
                               tokenized into terms
Fields storing metadata
  entity_type                  String representation of entity type
                               (e.g., “CLASS”)
Fields for navigation
  fan_in_mcall_local           Entity ids of all local callers for a method
                               from the same project

Table 8.4: Sample search index fields
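
To make the descriptions of short_name and the tokenized fields concrete, here is a small Python sketch of how such values might be derived from an FQN. The source does not describe Sourcerer's actual tokenizer, so the splitting rule used here (non-alphanumeric separators plus camelCase boundaries, lowercased) is an assumption, and the example FQN is hypothetical.

```python
import re


def short_name(fqn):
    """Rightmost fragment of the FQN, dropping method arguments if present."""
    without_args = fqn.split("(", 1)[0]
    return without_args.rsplit(".", 1)[-1]


def tokenize(text):
    """Split text into lowercase terms (assumed scheme, not Sourcerer's own).

    Splits on non-alphanumeric characters first, then on camelCase boundaries.
    """
    terms = []
    for part in re.split(r"[^A-Za-z0-9]+", text):
        terms.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", part))
    return [term.lower() for term in terms]


# Hypothetical entity, used only for illustration.
fqn = "org.example.DateUtil.dayOfWeek(java.util.Date)"
name = short_name(fqn)
print(name)
print(tokenize(name))
```

Under this assumed scheme, a method named dayOfWeek yields the terms day, of, and week, which is the kind of match a clause such as short_name: (day of week) relies on.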
 