Querying Graph Databases - Advanced Database Query Systems

Database Reference

In-Depth Information

in all graph-based applications. A primary challenge in computing the answers of graph queries is that

pair-wise comparisons of graphs are usually really hard problems. For example, subgraph isomorphism

is known to be NP-complete (Garey & Johnson, 1979). A naive approach to compute the answer set of

a graph query q is to perform a sequential scan on the graph database and to check whether each graph

database member satisfies the conditions of q or not. However, the graph database can be very large which

makes the sequential scan over the database impracticable. Thus, finding an efficient search technique

is immensely important due to the combined costs of pair-wise comparisons and the increasing size of

modern graph databases. It is apparent that the success of any graph database application is directly

dependent on the efficiency of the graph indexing and query processing mechanisms. Recently, there

are many techniques that have been proposed to tackle these problems. This chapter gives an overview

of different techniques of indexing and querying graph databases and classifies them according to their

target graph query types and their indexing strategy.

The rest of the chapter is organized as follows. The Preliminary section introduces preliminaries of

graph databases and graph query processing. In Section (Subgraph Query Processing), a classification

of the approaches of subgraph query processing problem and their index structures is given while the

section (Supergraph Query Processing) focuses on the approaches for resolving the supergraph query

processing problem. Section (Graph Similarity Queries) discusses the approach of approximate graph

matching queries. Section (Graph Query Languages) gives an overview of several proposals of graph

query languages. Finally, Section (Discussion and Conclusions) concludes the chapter and provides

some suggestions for possible future research directions on the subject.

PRELIMINARIES

In this section, we introduce the basic terminologies used in this chapter and give the formal definition

of graph querying problems.

Graph Data Structure

Graphs are vey powerful modeling tool. They are used to model complicated structures and schemaless

data. In graph data structures, vertices and edges represent the entities and the relationships between

them respectively. The attributes associated with these entities and relationships are called labels. A

graph database D is defined as a collection of member graphs D = {g 1 ,g 2 ,..., g n } where each member

graph database member g i is denoted as (V, E, L v ,L e ,F v ,F e ) where V is the set of vertices; E ∧ V * V is

the set of edges joining two distinct vertices; L v is the set of vertex labels; L e is the set of edge labels;

F V is a function V → L v that assigns labels to vertices and F e is a function E → L e that assigns labels to

edges. In general, graph data structures can be classified according to the direction of their edges into

two main classes:

• Directed-labeled graphs : such as XML, RDF and traffic networks.

• Undirected-labeled graphs : such as social networks and chemical compounds.

In principle, there are two main types of graph databases. The first type consists of few numbers of

very large graphs such as the Web graph and social networks ( non-transactional graph databases ). The

Search WWH ::

Custom Search

Home