Table 1. Summary of relational techniques for processing RDF queries

Query Engine | Class | Optimization Technique(s)
----------------------------------------------------------------------
3store (Harris & Gibbins, 2003) | Vertical Store | Hash tables
RDF-3X (Neumann & Weikum, 2008) | Vertical Store | Exhaustive indexing of all permutations of (S,P,O); merge joins; cost-based query optimization
Hexastore (Weiss et al., 2008) | Vertical Store | Materialization of all orders of the 3 RDF elements
RDFMATCH (Chong et al., 2005) | Property Tables | Materialized join views based on user demand and query workloads
(Levandoski & Mokbel, 2009) | Property Tables | Automated inference independent of query workloads
(Matono et al., 2005) | Property Tables | Path-based storage of RDF data
SW-Store (Abadi et al., 2009) | Horizontal Stores | Column-oriented storage of RDF data
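
The "all permutations of (S,P,O)" entries in Table 1 can be made concrete with a small sketch. The code below is not the implementation of RDF-3X or Hexastore; it is a minimal in-memory illustration, assuming toy triples and hypothetical helper names (build_indexes, prefix_scan), of why keeping one sorted copy of the data per ordering lets any triple pattern with bound components be answered by a prefix scan.

```python
from bisect import bisect_left
from itertools import permutations

# Toy triples; all identifiers are made up for illustration.
TRIPLES = [
    ("paper1", "creator", "alice"),
    ("paper1", "type", "Article"),
    ("paper2", "creator", "alice"),
    ("paper2", "creator", "bob"),
    ("paper2", "type", "Inproceedings"),
]

def build_indexes(triples):
    """Return one sorted list per ordering of (S, P, O), keyed by e.g. 'pos'."""
    indexes = {}
    for order in permutations("spo"):
        positions = ["spo".index(c) for c in order]
        indexes["".join(order)] = sorted(
            tuple(t[i] for i in positions) for t in triples
        )
    return indexes

def prefix_scan(index, prefix):
    """Yield every entry of a sorted index that starts with the given prefix."""
    start = bisect_left(index, prefix)
    for entry in index[start:]:
        if entry[: len(prefix)] != prefix:
            break
        yield entry

indexes = build_indexes(TRIPLES)

# Pattern (?s, creator, alice): predicate and object are bound, so the
# (P, O, S) ordering turns the lookup into a single contiguous range.
papers_by_alice = [s for (_, _, s) in prefix_scan(indexes["pos"], ("creator", "alice"))]
```

The cost of this design is storage: every triple is held six times, which real systems typically offset with dictionary encoding and compression rather than plain tuples as used here.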
EXPERIMENTAL EVALUATION
In this section, we present an experimental evaluation of the different approaches that rely on
the relational infrastructure to provide scalable engines for storing and querying RDF data
(MahmoudiNasab & Sakr, 2010).
SP2Bench Performance Benchmark
Schmidt et al. (2009) presented the SPARQL Performance Benchmark (SP2Bench), which is based
on the DBLP scenario (DBLP XML Records, 2009). The DBLP database provides extensive
bibliographic information about the field of Computer Science and, in particular, databases. The
benchmark is accompanied by a data generator that supports the creation of arbitrarily large
DBLP-like models in RDF format. This data generator mirrors the key characteristics and
distributions of the original DBLP dataset. The logical RDF schema for the DBLP dataset consists
of Authors and Editors entities, which are representation types of Persons, and a superclass
Document, which is decomposed into several subclasses: Proceedings, Inproceedings, Journal,
Article, Book, PhDThesis, MasterThesis, Incollection and WWW resources. The RDF graph
representation of these entities reflects their instantiation and the different types of
relationships between them.
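
To illustrate how such a schema appears as an RDF graph, the following sketch encodes the Document hierarchy and a few instances as plain (subject, predicate, object) tuples. The predicate names, instance identifiers and the instances_of helper are simplified assumptions made for this example; the URIs actually used by the benchmark differ.

```python
# Hypothetical, simplified vocabulary; the benchmark's real URIs differ.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

DOCUMENT_SUBCLASSES = [
    "Proceedings", "Inproceedings", "Journal", "Article", "Book",
    "PhDThesis", "MasterThesis", "Incollection", "WWW",
]

# Schema triples: every listed class is a subclass of Document.
schema = {(cls, SUBCLASS_OF, "Document") for cls in DOCUMENT_SUBCLASSES}

# Instance triples: made-up documents and persons linked by roles.
data = {
    ("article1", TYPE, "Article"),
    ("article1", "creator", "person1"),
    ("proc1", TYPE, "Proceedings"),
    ("proc1", "editor", "person2"),
    ("person1", TYPE, "Person"),
    ("person2", TYPE, "Person"),
}

def instances_of(cls, graph):
    """Return subjects typed as cls or as one of its direct subclasses."""
    classes = {cls} | {s for (s, p, o) in graph if p == SUBCLASS_OF and o == cls}
    return {s for (s, p, o) in graph if p == TYPE and o in classes}

graph = schema | data
documents = instances_of("Document", graph)
# A person acts as an author or editor through the relationship, not a separate type.
authors = {o for (s, p, o) in graph if p == "creator"}
editors = {o for (s, p, o) in graph if p == "editor"}
```

In this encoding, being an author or an editor is a role a Person takes through a relationship to a Document, which is one way to read the "representation types of Persons" wording above.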
In addition, the benchmark provides 17 queries, defined in the SPARQL query language over
the structure of the DBLP dataset, that are designed to cover the most important SPARQL constructs
and operator constellations. The queries vary in their complexity and result size. Table 2 lists the
SP2Bench benchmark queries. For more details about the benchmark specification, the data generation
algorithm and the SPARQL definitions of the benchmark queries, we refer the reader to Schmidt et al. (2009).
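
As a rough illustration of what evaluating a SPARQL query on a relational engine involves, the sketch below stores triples in a single three-column table and translates a two-pattern query into a self-join. It is an assumption-laden example: SQLite (via Python's sqlite3 module) stands in for the DBMS, the table and column names are hypothetical, and the query is an invented pattern, not one of the 17 benchmark queries.

```python
import sqlite3

# In-memory database with a single triple table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("article1", "type", "Article"),
        ("article1", "creator", "alice"),
        ("article2", "type", "Article"),
        ("article2", "creator", "bob"),
        ("book1", "type", "Book"),
        ("book1", "creator", "alice"),
    ],
)

# Illustrative pattern:  ?doc type Article .  ?doc creator ?person
# Each triple pattern becomes one alias of the table; the shared
# variable ?doc becomes a join condition on the subject column.
rows = conn.execute(
    """
    SELECT t1.s AS doc, t2.o AS person
    FROM triples AS t1
    JOIN triples AS t2 ON t1.s = t2.s
    WHERE t1.p = 'type' AND t1.o = 'Article'
      AND t2.p = 'creator'
    ORDER BY doc
    """
).fetchall()
```

Each additional triple pattern adds another self-join on the same table, which is why queries with many patterns stress join processing in this kind of storage layout.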
Experimental Settings
Our experimental evaluation of the alternative relational RDF storage techniques was conducted using
the IBM DB2 DBMS running on a PC with 3.2 GHz Intel Xeon processors, 4 GB of main memory
and 250 GB of SCSI secondary storage. We used the SP2Bench data generator to produce four
testing datasets containing 500K, 1M, 2M and 4M triples. In our evaluation, we
consider the following four alternative relational storage schemes: