In addition to the three fields for method headers, there are two special fields for
class constructors. Fortunately, the class name does not need to be stored in these
fields: it is unique for every stored entity and can thus be added as necessary via an
AND query on the name field.
5.5.1 Query Parsing
Although it might seem simple at first sight, finding a good approach for generically
formulating queries for software search engines is an interesting problem in its own
right, since at least three different styles of query formulation need to be supported:
free-text queries, signature-based queries and API-based queries. Once a
search infrastructure as described above is in place, it is in principle possible to use
the field structure of the index to support searches on directly expressed queries.
In simple cases this might be satisfactory (e.g. for name-based queries in the
commercial Koders search engine). For API-driven searches, however, it would
require a deeper knowledge and understanding of the internal index structure and
is hence definitely an unintuitive approach.
In order to overcome this challenge, it makes sense to support a query formulation
and parsing approach that only uses concepts that are already familiar to
software developers (i.e. the users of a software search engine). To our knowledge,
Merobase is currently the only software search engine that supports such an approach
directly by accepting code in supported programming languages as queries
(although the Eclipse plugin CodeGenie has a similar capability on the client side
[34]). Thus, all queries in Merobase are first fed to a modified Java/C# parser that
is able to recognize class and method interfaces such as the following API-based
search for a simple Matrix class:
public class Matrix {
    public Matrix add(Matrix m) {}
    public Matrix multiply(Matrix m) {}
}
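To make the idea of recognizing class and method interfaces more concrete, the following is a minimal sketch of how a class name and its method signatures might be pulled out of such a query. It is not Merobase's parser: the class InterfaceQuerySketch, its methods and the regular expressions are hypothetical, and a regex-based approach is a deliberate simplification (it ignores generics containing commas, annotations and nested classes) where a real implementation would use a proper Java/C# grammar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: extracts interface information from a code-style query.
final class InterfaceQuerySketch {

    // Finds the name that follows the "class" or "interface" keyword.
    private static final Pattern CLASS_NAME =
            Pattern.compile("\\b(?:class|interface)\\s+(\\w+)");

    // Optional modifiers, then a return type, a method name and a parameter list.
    private static final Pattern METHOD = Pattern.compile(
            "(?:(?:public|protected|private|static|final|abstract)\\s+)*"
            + "([\\w.<>\\[\\]]+)\\s+(\\w+)\\s*\\(([^)]*)\\)");

    /** Returns the declared class name, or null if the query declares none. */
    public static String className(String query) {
        Matcher m = CLASS_NAME.matcher(query);
        return m.find() ? m.group(1) : null;
    }

    /** Returns signatures normalized to the form name(paramTypes):returnType. */
    public static List<String> methodSignatures(String query) {
        String owner = className(query);
        List<String> signatures = new ArrayList<>();
        Matcher m = METHOD.matcher(query);
        while (m.find()) {
            String returnType = m.group(1);
            String name = m.group(2);
            if (name.equals(owner)) {
                continue; // a constructor, not a method: skipped in this sketch
            }
            List<String> paramTypes = new ArrayList<>();
            String params = m.group(3).trim();
            if (!params.isEmpty()) {
                for (String param : params.split(",")) {
                    // "Matrix m" -> "Matrix"; a bare "Matrix" is kept as the type.
                    String[] tokens = param.trim().split("\\s+");
                    paramTypes.add(tokens.length >= 2
                            ? tokens[tokens.length - 2] : tokens[0]);
                }
            }
            signatures.add(name + "(" + String.join(",", paramTypes) + "):" + returnType);
        }
        return signatures;
    }

    public static void main(String[] args) {
        String query = "public class Matrix {\n"
                + "    public Matrix add(Matrix m) {}\n"
                + "    public Matrix multiply(Matrix m) {}\n"
                + "}";
        System.out.println(className(query));
        System.out.println(methodSignatures(query));
    }
}
```

Normalizing each method to name(paramTypes):returnType drops parameter names and modifiers, which are presumably irrelevant for matching, and leaves strings that could be compared against indexed signature fields.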
Accepting source code as queries allows developers to search in their favorite
programming language and makes it straightforward to develop recommender tools that
issue searches directly from a development environment. However, since a query
interface should ideally be programming language independent and customizable,
Merobase also includes a small, intuitive retrieval language that accepts method
headers (such as random(float,float):float;) and class interface descriptions as
programming language independent input. It is based on the textual representation
of operations in UML class diagrams. The following snippet provides an
example:
Customer (
getAddress():String;
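
As an illustration of the method-header form of this notation, such as random(float,float):float;, the following minimal sketch splits a single header into its name, parameter types and return type. The class MethodHeaderSketch and its grammar are assumptions made for this example, not Merobase's actual retrieval-language parser; in particular, the sketch assumes that the return type is always present and that the trailing semicolon is optional.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: parses one UML-style method header.
final class MethodHeaderSketch {

    // name ( comma-separated types ) : returnType, with an optional ';' at the end.
    private static final Pattern HEADER = Pattern.compile(
            "\\s*(\\w+)\\s*\\(([^)]*)\\)\\s*:\\s*([\\w.\\[\\]]+)\\s*;?\\s*");

    final String name;
    final List<String> paramTypes;
    final String returnType;

    private MethodHeaderSketch(String name, List<String> paramTypes, String returnType) {
        this.name = name;
        this.paramTypes = paramTypes;
        this.returnType = returnType;
    }

    /** Parses a header or throws IllegalArgumentException if it does not match. */
    public static MethodHeaderSketch parse(String text) {
        Matcher m = HEADER.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a method header: " + text);
        }
        List<String> types = new ArrayList<>();
        for (String type : m.group(2).split(",")) {
            if (!type.trim().isEmpty()) { // an empty parameter list yields no types
                types.add(type.trim());
            }
        }
        return new MethodHeaderSketch(m.group(1), types, m.group(3));
    }

    public static void main(String[] args) {
        MethodHeaderSketch h = parse("random(float,float):float;");
        System.out.println(h.name + " " + h.paramTypes + " -> " + h.returnType);
    }
}
```

Because the notation carries only names and types, the same parsed structure could serve queries that target Java, C# or any other indexed language.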