Algorithm 2 Predicate/Join Processing Algorithm
Input: Table t ∈ T, predicates of Q, required attributes in result
Output: Result of t
1: if there is a predicate on non-PK/non-FK then
2:   if d == 0 for t then
3:     Apply predicate on t to get the record ids
4:     Store the record-id mapping in the format
5:     (rec-id1, rec-id2, …)
6:     Communicate if necessary with other nodes
7:   else if any table t1 with d1 <= d is referenced by t then
8:     Apply predicate on t
9:     Update the mapping with rec-ids of t
10:     Perform line 9
11:     Eliminate mappings that have no match for t
12:   else
13:     Perform similar to line 6, 9 and 14
14:   end if
15: else if there is a predicate on PK or FK then
16:   if d == 0 for t then
17:     Scan PK-map and tuple-index-map
18:     Perform line 6 to 8
19:   else
20:     Scan PK-map and tuple-index-map for those rec-ids stored
        for table t1 with d1 <= d that is referenced by t
21:     Perform 12 and 14
22:   end if
23: end if
24: Scan tables of T for final mappings (rec-id1, …) to get the value of
    other attributes in the select statement of Q
25: return Result
(b) Join Processing Algorithm [17]
(a) Aggregate Query Processing Algorithm
Fig. 4. Query Processing Algorithms
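The record-id mapping idea behind Algorithm 2 can be illustrated with a minimal in-memory sketch. It is not the implementation evaluated here: it assumes tables are plain Python lists, that a rec-id is simply a row's list position, and the helper names `apply_predicate` and `extend_mappings` are hypothetical. The correspondence to the numbered steps is one reading of the pseudocode, and the rows are toy data in the style of the TPC-H REGION and NATION relations.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Mapping = Tuple[int, ...]  # (rec-id1, rec-id2, ...)


def apply_predicate(table: List[Row], pred: Callable[[Row], bool]) -> List[Mapping]:
    """Roughly steps 3-5: keep the rec-ids of rows that satisfy the predicate."""
    return [(rid,) for rid, row in enumerate(table) if pred(row)]


def extend_mappings(
    mappings: List[Mapping],
    referenced: List[Row],
    table: List[Row],
    fk: str,
    pk: str,
    pred: Callable[[Row], bool] = lambda row: True,
) -> List[Mapping]:
    """Roughly steps 8-11: append rec-ids of `table`, dropping unmatched mappings."""
    # Group the qualifying rec-ids of `table` by their foreign-key value.
    by_fk: Dict[object, List[int]] = {}
    for rid, row in enumerate(table):
        if pred(row):
            by_fk.setdefault(row[fk], []).append(rid)
    extended: List[Mapping] = []
    for m in mappings:
        key = referenced[m[-1]][pk]  # key of the row this mapping currently ends in
        for rid in by_fk.get(key, []):  # mappings with no match are eliminated
            extended.append(m + (rid,))
    return extended


# Toy data (assumption): REGION is referenced by NATION.
region = [
    {"r_regionkey": 0, "r_name": "AFRICA"},
    {"r_regionkey": 1, "r_name": "AMERICA"},
    {"r_regionkey": 2, "r_name": "ASIA"},
]
nation = [
    {"n_nationkey": 0, "n_name": "ALGERIA", "n_regionkey": 0},
    {"n_nationkey": 1, "n_name": "ARGENTINA", "n_regionkey": 1},
    {"n_nationkey": 2, "n_name": "BRAZIL", "n_regionkey": 1},
    {"n_nationkey": 8, "n_name": "INDIA", "n_regionkey": 2},
]

mappings = apply_predicate(region, lambda r: r["r_name"] == "AMERICA")
joined = extend_mappings(mappings, region, nation, fk="n_regionkey", pk="r_regionkey")
# Step 24: fetch the remaining attributes only for the surviving mappings.
names = [nation[m[1]]["n_name"] for m in joined]
```

The point of the sketch is that only tuples of rec-ids flow between steps; attribute values are read once, at the end, for the mappings that survive every predicate.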
Moreover, our maps are already sorted on their keys, which eliminates most of
the sort operations. Finally, we retrieve the remaining attributes required for
the result.
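To see why key-sorted maps remove the need for explicit sorting, consider the following small sketch. The layout is an assumption made for illustration (a list of keys in ascending order with a parallel list of rec-ids), and the function names are hypothetical; the actual PK-map format may differ.

```python
from bisect import bisect_left, bisect_right

# Hypothetical PK-map layout: keys in ascending order, with the rec-id of
# each key stored at the same position in a parallel list.
pk_keys = [3, 7, 12, 18, 25, 31]
pk_rec_ids = [40, 11, 52, 7, 33, 19]


def rec_ids_in_range(low, high):
    """Return rec-ids whose key lies in [low, high], in key order, without sorting."""
    start = bisect_left(pk_keys, low)    # first position with key >= low
    end = bisect_right(pk_keys, high)    # first position with key > high
    return pk_rec_ids[start:end]


def intersect_sorted(a, b):
    """Intersect two ascending key lists in a single linear pass (merge style)."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


in_range = rec_ids_in_range(7, 25)
common = intersect_sorted([3, 7, 12, 18], [7, 18, 25])
```

Because both inputs are already ordered, a range predicate reduces to two binary searches and matching two key lists needs only one pass over each, with no sort step in between.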
4 Performance Evaluation
In this section, we present a performance study that shows the effectiveness of
our proposed PK-map and Tuple-index-map structures when processing aggregate
queries (using the algorithms in Figure 4a and Figure 4b). We compare the
performance of MySQL and our proposed framework on PlanetLab, a large-scale
cloud network, with 150GB of TPC-H star schema data.
PlanetLab [12] [13] is a geographically distributed computing platform
available as a testbed for deploying, evaluating, and accessing planetary-scale
network services. It is currently composed of around 1050 nodes (servers) at
400 sites (locations) worldwide.
For the performance study in this paper, we chose 50 PlanetLab machines
worldwide running the Red Hat 4.1 operating system. Each machine has a 2.33GHz
Intel Core 2 Duo processor, 4GB of RAM, and 10GB of disk space. We installed
regular MySQL on all of the machines to perform the experiments.
We generated 150GB of data using the data generator "dbgen", provided by the
TPC-H benchmark, and distributed it to the 50 PlanetLab machines. Each of these
machines stores around 3GB of data fragments of the TPC-H schema relations. We
generated PK-maps and Tuple-index-maps, and then horizontally partitioned them