Querying Graph Databases - Advanced Database Query Systems

Database Reference

In-Depth Information

second type consists of a large set of small graphs such as chemical compounds and biological pathways

( transactional graph databases ). The main focus of this chapter is on giving an overview of the efficient

indexing and querying mechanisms on the second type of graph databases.

Graph Queries

In principle, queries in transactional graph databases can be broadly classified into the following main

categories:

1. Subgraph queries: this category searches for a specific pattern in the graph database. The pat-

tern can be either a small graph or a graph where some parts of it are uncertain, e.g., vertices with

wildcard labels. Therefore, given a graph database D = {g 1 ,g 2 ,..., g n } and a subgraph query q , the

query answer set A = {g i |q ∧ g i ,g i ∧ D}. A graph q is described as a subgraph of another graph

database member gi if the set of vertices and edges of q form subset of the vertices and edges of

g i . To be more formal, let us assume that we have two graphs g 1 (V 1 , E 1 , L v1 ,L e1 ,F v1 ,F e1 ) and g 2 (V 1 ,

E 2 , L v2 ,L e2 ,F v2 ,F e2 ) , g 1 is defined as subgraph of g 2 , if and only if:

◦ For every distinct vertex x ∧ V 1 with a label v l ∧ L v1 , there is a distinct vertex y ∧ V 2 with a

label v l ∧ L v2 .

◦ For every distinct edge edge ab ∧ E 1 with a label e l ∧ L e1 , there is a distinct edge ab ∧ E 2

with a label e l ∧ L e2 .

Figure 1(a) illustrates the subgraph search problem. Figure 2(a) shows an example of a graph data-

base. Figure 2(b) illustrates examples of graph queries ( q 1 and q 2 ). Let us assume that these queries

are subgraph queries. If we evaluate these queries over the sample graph database (Figure 2(a))

then the answer set of q 1 will consist of the graph database members g 1 and g 2 while the answer

set of q 2 will be empty. The more general type of the subgraph search problem is the subgraph

isomorphism search problem, which is defined as follows. Let g 1 =(V 1 ,E 1 ,L v1 ,L e1 ,F v1 ,F e1 ) and g 1

=(V 2 ,E 2 ,L v2 ,L e2 ,F v2 ,F e2 ) be two graphs, g 1 is defined as a graph isomorphism to g 2 , if and only if

there exists at least one bijective function f: V 1 → V 2 such that: 1) for any edge uv ∧ E 1 , there is an

edge f(u)f(v) ∧ E 2 . 2) F v1 (u)= F v2 (f(u)) and F v1 (v)= F v2 (f(v)) . 3) F e1 (uv)= F e2 (f(u)f(v)) .

2. Supergraph queries: searches for the graph database members of which their whole structures are

contained in the input query. Therefore, given a graph database D = {g 1 ,g 2 ,..., g n } and a supergraph

query q , the query answer set A = {g i |q ∧ g i ,g i ∧ D} . Figure 1(b) illustrates the subgraph search

problem. Let us assume that the graph queries of Figure 2(b) are supergraph queries.

If we evaluate these queries over the sample graph database (Figure 2(a)) then the answer set of q 1

will be empty while the answer set of q 2 will contain the graph database member g 3 .

3. Similarity (Approximate Matching) queries: this category finds graphs which are similar , but

not necessarily isomorphic to a given query graph. Given a graph database D = {g 1 ,g 2 ,..., g n } and

a query graph q , similarity search is to discover all graphs that are approximately similar to the

graph query q . A key question in graph similarity queries is how to measure the similarity between

a target graph member of the database and the query graph. In fact, it is difficult to give a precise

definition of graph similarity. Different approaches have proposed different similarity metrics

for graph data structures (Bunke & Shearer, 1998; Fern•andez & Valiente, 2001; Raymond et al.,

2002). Discussing these different similarity metrics is out of the scope of this chapter. We refer the

interested reader to a detailed survey in (Gao et al., 2009).

Search WWH ::

Custom Search

Home