(a) Example SPARQL query. The four ?producer triple patterns form star subpattern SJ 1; the four ?product triple patterns form SJ 2.

SELECT * WHERE {
  ?producer type Producer .
  ?producer label ?prcLabel .
  ?producer date ?prcdate .
  ?producer hpage ?hpage .
  ?product pub ?producer .
  ?product type Product .
  ?product label ?prodLabel .
  ?product date ?prodDate .
}

(b) NTGA workflow
  MR Job 1: M: POTGGroupAnnotator; R: POTGGroupPackage
  MR Job 2: M: POTGJoinAnnotator; R: POTGJoinPackage

(c) Pig Latin workflow
  MR Job 1 (SJ 1): properties (pub), (label), (date), (hpage); M: POUnion; R: POJoinPackage
  MR Job 2 (SJ 2): properties (pub), (type), (label), (date); M: POUnion; R: POJoinPackage
  MR Job 3 (SJ 1, SJ 2): M: POLocalRearrange; R: POJoinPackage

FIGURE 6.12 (a) Example SPARQL query, (b) MR execution workflow for the example
query using NTGA operators, and (c) corresponding workflow using Pig Latin operators.
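
The grouping-based evaluation of star subpatterns that the figure contrasts with per-star joins can be illustrated with a small in-memory sketch. It is not the NTGA implementation and does not run on MapReduce; the function names `group_by_subject` and `match_stars`, the `STARS` table, and the sample triples are all hypothetical, and the constraint on the object of `type` (Producer versus Product) is ignored for brevity. The point it shows is that one pass over the triples builds every subject group, after which both star subpatterns are matched against the same groups.

```python
from collections import defaultdict

# Property sets of the two star subpatterns in the example query.
# Assumption: only property names are checked, not the object of "type".
STARS = {
    "SJ1": {"type", "label", "date", "hpage"},
    "SJ2": {"pub", "type", "label", "date"},
}


def group_by_subject(triples):
    """Single pass over the input: collect each subject's (property, object) pairs."""
    groups = defaultdict(list)
    for subject, prop, obj in triples:
        groups[subject].append((prop, obj))
    return groups


def match_stars(groups, stars=STARS):
    """Assign each subject group to every star whose required properties it covers."""
    matches = {name: [] for name in stars}
    for subject, pairs in groups.items():
        props = {prop for prop, _ in pairs}
        for name, required in stars.items():
            if required <= props:  # subset test: all required properties present
                matches[name].append(subject)
    return matches


# Hypothetical sample data: one producer and one product.
triples = [
    ("prc1", "type", "Producer"),
    ("prc1", "label", "Acme"),
    ("prc1", "date", "2008-01-01"),
    ("prc1", "hpage", "http://acme.example"),
    ("prd1", "type", "Product"),
    ("prd1", "label", "Widget"),
    ("prd1", "date", "2009-05-05"),
    ("prd1", "pub", "prc1"),
]

matches = match_stars(group_by_subject(triples))
print(matches)
```

In this sketch the `label` and `date` triples are read once even though both stars use those properties, because the stars are matched against shared subject groups rather than against separate scans of the input.
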
enable scan sharing while processing graph pattern queries with repeated properties. In NTGA, scan sharing arises naturally when processing such graph patterns because all star subpatterns are executed as a grouping operation, which requires only one scan of the entire input set. The triples for a particular property are therefore scanned only once, regardless of how many times that property is used in a query. Figure 6.12b shows the NTGA-based MR workflow, which scans the input data set only once. However, this scan sharing in the presence of repeated properties can lead to ambiguities in the semantics of triplegroups because triplegroups are assumed to