6.7 CASE STUDY: EVALUATION OF NTGA EXECUTION PLANS
In this section, we present results comparing the performance of evaluating graph
pattern queries using NTGA execution plans against relational-style execution plans
in Apache Pig. We evaluated scalability with respect to both the number of join
operations in a query and the size of the cluster.
6.7.1 Setup and Testbed
Experiments were conducted on Hadoop clusters of 5 to 30 nodes. Two synthetic data
sets were used: D1, with size 51 GB (approximately 200 million n-triples),
generated using the BSBM benchmark generator, and D2, a modified version of the
analysis benchmark data set used in [34], with size 43 GB (approximately 1 billion
3-ary triples).
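The two data sets differ far more in triple count than in size on disk, which is easier to see as an average record size. The following is a back-of-the-envelope sketch only: it assumes 1 GB means 10^9 bytes (the text does not say whether GB or GiB is meant) and uses the rounded figures quoted above, so the results are approximate. The names `datasets` and `bytes_per_triple` are illustrative and do not come from the study.

```python
# Rough average record size for the two data sets described above.
# Assumption: 1 GB = 10**9 bytes; sizes and triple counts are the rounded
# figures quoted in the text, so the results are only approximate.
GB = 10**9

datasets = {
    "D1": {"size_gb": 51, "triples": 200_000_000},    # BSBM-generated
    "D2": {"size_gb": 43, "triples": 1_000_000_000},  # modified data set from [34]
}


def bytes_per_triple(size_gb, triples):
    """Return the average number of bytes stored per triple."""
    return size_gb * GB / triples


avg = {
    name: bytes_per_triple(d["size_gb"], d["triples"])
    for name, d in datasets.items()
}

for name, value in avg.items():
    print(f"{name}: about {value:.0f} bytes per triple")
```

Under these assumptions, D1 averages roughly 255 bytes per triple and D2 roughly 43, so D2 packs about five times as many records into a slightly smaller volume.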
6.7.2 Scalability with Increasing Joins
Figure 6.11a shows the results for queries Q1, Q2, Q3, and Q4, which have 3, 5, 9,
and 11 joins, respectively. The number of star subpatterns varies from one in Q1 to four in Q4,
[Figure 6.11: two bar charts comparing Pig and RAPID+. (a) Queries Q1 to Q4; vertical axis from 0 to 7000. (b) 10-, 20-, and 30-node clusters; vertical axis from 0 to 9000.]
FIGURE 6.11 Scalability study with (a) increasing joins using the 51 GB data set on a
10-node cluster and (b) increasing cluster size using the 43 GB data set.
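One way to read the join counts quoted for Q1 and Q4 is to split them into joins inside star subpatterns and joins between star subpatterns. The sketch below is illustrative arithmetic, not part of the reported evaluation. It assumes the star subpatterns of a query are connected acyclically, so that k stars are linked by exactly k - 1 joins. The text does not state this, and the function name `split_joins` is hypothetical.

```python
def split_joins(total_joins, num_stars):
    """Split a query's join count into (intra_star, inter_star) joins.

    Assumption (not stated in the text): the star subpatterns are connected
    acyclically, so num_stars stars are linked by num_stars - 1 joins.
    """
    if num_stars < 1:
        raise ValueError("a query needs at least one star subpattern")
    inter_star = num_stars - 1
    intra_star = total_joins - inter_star
    if intra_star < 0:
        raise ValueError("too few joins to connect that many stars")
    return intra_star, inter_star


# Join and star counts quoted in the text for Q1 and Q4.
print(split_joins(3, 1))   # (3, 0)
print(split_joins(11, 4))  # (8, 3)
```

Under that assumption, all 3 joins of Q1 and 8 of the 11 joins of Q4 fall inside star subpatterns.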