6 Algebraic Optimization of RDF Graph Pattern Queries on MapReduce
Kemafor Anyanwu, Padmashree Ravindra, and HyeongSik Kim
CONTENTS
6.1 Introduction
6.2 Data Processing Using MapReduce: An Overview
    6.2.1 RDF Data Processing on MapReduce
    6.2.2 Case Study: Different Groupings of Star-Joins
6.3 Related Work
    6.3.1 Distributed RDF Query Processing Systems
    6.3.2 Query Processing Systems on Vanilla Hadoop Platform
    6.3.3 Query Processing Systems on Extended Hadoop Platforms
    6.3.4 Complementary Optimization Techniques on MapReduce
6.4 SPARQL Query Compilation on Hadoop-Based Platforms: A Case Study on Apache Pig
    6.4.1 Logical Plan Translation
    6.4.2 Physical Plan Translation
    6.4.3 MapReduce Plan Translation
6.5 An Alternative Algebra for Evaluating Graph Pattern Queries on MapReduce
    6.5.1 The Case for a “Groups of Triples” Data Model and Algebra
    6.5.2 The Nested TripleGroup Data Model and Algebra (NTGA)
        6.5.2.1 Content Equivalence
6.6 RAPID+: An Implementation of NTGA
    6.6.1 System Architecture
    6.6.2 SPARQL Query Compilation in RAPID+
    6.6.3 Implementation of NTGA Operators
        6.6.3.1 Data Model Representation: RDFMap
        6.6.3.2 Implementation of TG_GroupBy
        6.6.3.3 Implementation of TG_GroupFilter
        6.6.3.4 Implementation of TG_Join
6.7 Case Study: Evaluation of NTGA Execution Plans
    6.7.1 Setup and Testbed