Querying Graph Databases - Advanced Database Query Systems

Database Reference

In-Depth Information

and searching operations can be effectively improved. Moreover, operations on trees, such as isomorphism

and normalization are asymptotically simpler than graphs, which are usually NP-complete (Fortin, 1996).

Zhao et al. (2007) have extended the ideas of TreePi (Zhang et al., 2007) to achieve better pruning

ability by adding a small number of discriminative graphs (Δ) to the frequent tree-features in the index

structure. They propose a new graph indexing mechanism, named (Tree+Δ), which first selects frequent

tree-features as the basis of a graph index, and then on-demand selects a small number of discriminative

graph-features that can prune graphs more effectively than the selected tree-features, without conducting

costly graph mining beforehand.

SUPERGRAPH QUERY PROCESSING

The supergraph query-processing problem is important in practice. However, it has not been extensively

considered in the research literature. Only two approaches have been presented to deal with this problem.

Chen et al. (2007) have presented an approach named cIndex as the first work on supergraph query pro-

cessing. The indexing unit of this approach is the subgraphs which extracted from graph databases based

on their rarely occurrence in historical query graphs. Sometimes, the extracted subgraphs are very similar

to each other. Therefore, cIndex tries to find a set of contrast subgraphs that collectively perform well.

It uses a redundancy-aware selection mechanism to sort out the most significant and distinctive contrast

subgraphs. The distinctive subgraph is stored in a hierarchical fashion using bottom-up and top-down

proposed approaches. During query processing, cIndex reduces the number of subgraph isomorphism

testing by using the filtering and verification methodology. An advantage of the cIndex approach is that

the size of the feature index is small. On the other hand, the query logs may frequently change over

time so that the feature index maybe outdated quite often and need to be recomputed to stay effective.

Zhang et al. (2009) have investigated the supergraph query-processing problem from another angle.

They proposed an approach for compact organization of graph database members named GPTree . In this

approach, all of the graph database members are stored into one graph where the common subgraphs

are stored only once. An algorithm for extracting the key features from graph databases is used to con-

struct the feature indexes on graph database members. Based on the containment relationship between

the support sets of the extracted features, a mathematical approach is used to determine the ordering of

the feature set which can reduce the number of subgraph isomorphism tests during query processing.

GRAPH SIMILARITY QUERIES

The problem of similarity (approximate) subgraph queries has been addressed by different research efforts

in the literature. Given a query graph and a database of graphs, these approaches try to find subgraphs in

the database that are similar to the query. Therefore, these approaches can allow for node mismatches,

node gaps (Gap node is a node in the query that cannot be mapped to any node in the target graph), as

well as graph structural differences. Approximate graph matching techniques are used in some cases

when the graph databases are noisy or incomplete. In these cases, using approximate graph matching

query-processing techniques can be more useful and effective than exact matching. In this section, we

give an overview of the main approaches which address this problem.

Search WWH ::

Custom Search

Home