TABLE 5.2
T_s_po Table for RDF Graph in Figure 5.6

Rowkey      Family:Column→Value
Article1    p:title→{“PigSPARQL”}, p:year→{“2011”}, p:author→{Alex, Martin}
Article2    p:title→{“RDFPath”}, p:year→{“2011”}, p:author→{Martin, Alex}, p:cite→{Article1}
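
As a rough illustration of how the subject-keyed layout of Table 5.2 and its object-keyed counterpart T_o_ps relate to the underlying triples, the following Python sketch builds both from a flat triple list. It uses plain in-memory dictionaries rather than HBase, drops the quoting that marks literals in the tables, and the names TRIPLES and build_tables are hypothetical helpers introduced only for this example.

```python
from collections import defaultdict

# Triples of the example graph, read off Table 5.2.
# Assumption: literals are stored as plain strings without quotes.
TRIPLES = [
    ("Article1", "title", "PigSPARQL"),
    ("Article1", "year", "2011"),
    ("Article1", "author", "Alex"),
    ("Article1", "author", "Martin"),
    ("Article2", "title", "RDFPath"),
    ("Article2", "year", "2011"),
    ("Article2", "author", "Martin"),
    ("Article2", "author", "Alex"),
    ("Article2", "cite", "Article1"),
]


def build_tables(triples, family="p"):
    """Return (t_s_po, t_o_ps) as nested dicts: rowkey -> column -> set of values."""
    t_s_po = defaultdict(lambda: defaultdict(set))
    t_o_ps = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        column = f"{family}:{p}"
        t_s_po[s][column].add(o)  # subject as row key, object as value
        t_o_ps[o][column].add(s)  # object as row key, subject as value
    return t_s_po, t_o_ps


t_s_po, t_o_ps = build_tables(TRIPLES)
for rowkey, columns in t_o_ps.items():
    print(rowkey, {col: sorted(vals) for col, vals in columns.items()})
```

Both structures hold the same nine triples; only the row key differs, which is what lets a lookup start from either a known subject or a known object.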
TABLE 5.3
T_o_ps Table for RDF Graph in Figure 5.6

Rowkey         Family:Column→Value
“2011”         p:year→{Article1, Article2}
“PigSPARQL”    p:title→{Article1}
“RDFPath”      p:title→{Article2}
Alex           p:author→{Article1, Article2}
Article1       p:cite→{Article2}
Martin         p:author→{Article2, Article1}

[Figure: RDF graph in which Article1 and Article2 each have title, author, and year edges (titles "PigSPARQL" and "RDFPath", authors Alex and Martin, year "2011"), and Article2 has a cite edge to Article1.]

SPARQL BGP query
  SELECT *
  WHERE {
    ?article title ?title
    ?article author ?author
    ?article year ?year
  }

FIGURE 5.6
RDF graph and SPARQL query.
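
To make the query in Figure 5.6 concrete, the sketch below evaluates it against an in-memory copy of the T_s_po rows from Table 5.2. It is only an illustration under simplifying assumptions: all three triple patterns share the subject variable ?article, so one pass over the rows suffices, and no HBase access or distributed join is involved. The names T_S_PO, QUERY, and evaluate_shared_subject_bgp are hypothetical.

```python
from itertools import product

# T_s_po rows copied from Table 5.2 (rowkey -> column -> values).
T_S_PO = {
    "Article1": {
        "p:title": ["PigSPARQL"],
        "p:year": ["2011"],
        "p:author": ["Alex", "Martin"],
    },
    "Article2": {
        "p:title": ["RDFPath"],
        "p:year": ["2011"],
        "p:author": ["Martin", "Alex"],
        "p:cite": ["Article1"],
    },
}

# Each pattern "?article <predicate> ?var" becomes column -> variable name.
QUERY = {"p:title": "title", "p:author": "author", "p:year": "year"}


def evaluate_shared_subject_bgp(table, patterns, subject_var="article"):
    """Evaluate triple patterns that all share one subject variable."""
    solutions = []
    for subject, columns in table.items():
        # A row matches only if it has every requested predicate column.
        if not all(col in columns for col in patterns):
            continue
        value_lists = [columns[col] for col in patterns]
        # Multi-valued columns (two authors) yield one solution per combination.
        for combo in product(*value_lists):
            binding = {subject_var: subject}
            binding.update(zip(patterns.values(), combo))
            solutions.append(binding)
    return solutions


results = evaluate_shared_subject_bgp(T_S_PO, QUERY)
for row in results:
    print(row)
```

Because each article has two authors, the query yields two solutions per article, four in total.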
side such that no unnecessary data must be transferred over the network (predicate
push-down). As already mentioned in [25], a table with predicates as row keys
causes scalability problems: the number of predicates in an ontology is usually
fixed and relatively small, which results in a table with just a few very fat rows.
Because all data in a row is stored on the same machine, the resources of a
single machine in the cluster become a bottleneck. Instead, if only the predicate in a
triple pattern is given, we can use the HBase Filter API to answer the request with
a table scan on T_s_po or T_o_ps, using the predicate as a column filter. Table 5.4 shows the
mapping of every possible triple pattern to the corresponding HBase table. Overall,
experiments on our cluster showed that the two-table schema with server-side filters
 