Querying Graph Databases - Advanced Database Query Systems

Database Reference

In-Depth Information

Figure 4. Sample graph representation using GString basic structures

computed hash value of the graph query is then used to identify and verify the set of queries that

matches the canonical codes of the graph query. A clear advantage of the GDIndex approach is that no

candidate verification is required. However, the index is designed for databases that consist of rela-

tively smaller graphs and do not have a large number of distinct graphs.

GString

The GString approach (Jiang et al., 2007) considers the semantics of the graph structures in the database.

It focuses on modeling graph objects in the context of organic chemistry using basic structures ( Line ,

Star and Cycle ) that have semantic meaning and use them as indexing features. Line structure denotes

a structure consisting of a series of vertices connected end to end, Cycle structure denotes a structure

consisting of a series of vertices that form a close loop and Star structure denotes a structure where a

core vertex directly connects to several vertexes. For a graph g, GString first extracts all Cycle structures,

then it extracts all Star structures, and finally, it identifies the remaining structures as Line structures.

Figure 4 represents a sample graph representation using the GString basic structures. GString represents

both graphs and queries on graphs as string sequences and transforms the subgraph search problem to

the subsequence string-matching domain.

A suffix tree-based index structure for the string representations is then created to support an efficient

string matching process. Given a basic structure, its GString has three components: type , size , and a set

of annotations ( edits ). For Line or Cycle, the size is the number of vertices in the structure. For Star, the

size indicates the fanout of the central vertex. For a query graph q , GString derives its summary string

representation which is then matched against the suffix-tree of the graph database. An element of a sum-

mary string matches a node in the suffix-tree if their types match, sizes are equal or the size in the

query is no more than the size in the node and the counts of corresponding types of edits in the query

are no larger than those in the node. A key disadvantage of the GString approach is that converting

subgraph search queries into a string matching problem could be an inefficient approach especially if

the size of the graph database or the subgraph query is large. Additionally, GString focuses on decom-

posing chemical compounds into basic structures that have semantic meaning in the context of organic

chemistry and it is not trivial to extend this approach in other domain of applications.

Search WWH ::

Custom Search

Home