5 Large-Scale RDF Processing with MapReduce
Alexander Schätzle, Martin Przyjaciel-Zablocki,
Thomas Hornung, and Georg Lausen
CONTENTS
5.1 Introduction
5.2 Foundations
5.2.1 RDF and SPARQL
5.2.2 MapReduce
5.2.2.1 Map-Side vs. Reduce-Side Join
5.2.3 Pig Latin
5.3 SPARQL Translation
5.3.1 RDF Data Mapping
5.3.2 Algebra Translation
5.3.3 Optimizations
5.3.4 Example
5.4 PigSPARQL Evaluation
5.5 RDF Storage Schema for HBase
5.6 MAPSIN Join
5.6.1 Base Case
5.6.2 Cascading Joins
5.6.3 Multiway Join Optimization
5.6.4 One-Pattern Queries
5.7 MAPSIN Evaluation
5.8 Related Work
5.9 Conclusion
References
5.1 INTRODUCTION
Most of the information in the classical “Web of Documents” is designed for human
readers, whereas the idea behind the Semantic Web is to build a “Web of Data” that
enables computers to understand and use the information on the web. The advent
of this Web of Data gives rise to new challenges with regard to query evaluation