Figure 6.9. At first glance this seems like a strange strategy. Why limit access to table T1
by the processing resources of nodes 4, 5, and 6 when the table could be partitioned
across all six nodes? Counterintuitively, access to T1 may be faster when it is partitioned
across only nodes 4, 5, and 6, because access to a partitioned table is limited by the
slowest node in its partition group. When the partition groups overlap as they do in
Figure 6.8, access to data in table T1 on node 1 is limited by competition for node 1
resources (disk bandwidth, CPU, memory, etc.) due to activity on tables T2 and T3 on
the same node. If tables T2 and T3 are hot, access to T1 on node 1 may be quite slow.
As a result, the coordinator can return results from T1 no faster than node 1 can serve
its share of T1. The chain is only as strong as its weakest link. Designing the partition
groups with nonoverlapping topologies resolves the problem, at the cost of fewer overall
resources for table T1.
Figure 6.9
A six-node MPP with three nonoverlapping partition groups.
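The weakest-link argument can be made concrete with a small model. The sketch below is illustrative only. It assumes each node divides its capacity among the tables stored on it in proportion to a hypothetical "hotness" weight, and that a table's rows are spread evenly over its partition group. The node assignments for T2 and T3, the row counts, and the weights are invented for the example. The text states only that T1, T2, and T3 share node 1 in the overlapping case and that T1 lives on nodes 4, 5, and 6 in the nonoverlapping case.

```python
def scan_time(table, layout, rows, weight):
    """Time for the coordinator to receive every partition of `table`.

    layout: table name -> list of node ids holding a partition of it
    rows:   table name -> total rows, spread evenly over its nodes
    weight: table name -> relative demand the table places on a node
            (hypothetical "hotness" factor; larger means busier)
    """
    nodes = layout[table]
    rows_per_node = rows[table] / len(nodes)
    slowest = 0.0
    for node in nodes:
        # Total demand from all tables that have a partition on this node.
        total_weight = sum(w for t, w in weight.items() if node in layout[t])
        # The table gets weight[table] / total_weight of the node's
        # capacity, so its partition takes proportionally longer to read.
        node_time = rows_per_node * total_weight / weight[table]
        slowest = max(slowest, node_time)
    # The coordinator must wait for the slowest node in the group.
    return slowest


# Assumed workload: T2 and T3 are "hot", T1 is not.
ROWS = {"T1": 600, "T2": 600, "T3": 600}
WEIGHT = {"T1": 1, "T2": 3, "T3": 3}

# Assumed layouts; only T1's placement is taken from the text.
OVERLAPPING = {"T1": [1, 2, 3, 4, 5, 6], "T2": [1, 2, 3], "T3": [1, 2]}
NONOVERLAPPING = {"T1": [4, 5, 6], "T2": [1], "T3": [2, 3]}

overlapping_time = scan_time("T1", OVERLAPPING, ROWS, WEIGHT)
nonoverlapping_time = scan_time("T1", NONOVERLAPPING, ROWS, WEIGHT)
print("T1 scan, overlapping groups:   ", overlapping_time)
print("T1 scan, nonoverlapping groups:", nonoverlapping_time)
```

Under these assumed numbers, T1 takes 700 time units when spread across all six nodes, because nodes 1 and 2 give it only one-seventh of their capacity, and 200 time units when confined to nodes 4, 5, and 6, even though each of those nodes holds twice as many T1 rows.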
DB2 supports the notion of partition groups. With NCR Teradata all tables are
spread across all AMPs. This ensures that all AMPs are working on behalf of all tables,
and leads to a conceptually simple design, but does not allow the exploitation of subsets
of nodes. In general, overlapping topologies, such as the one illustrated in Figure 6.8,
are prone to skew, applying more work to some nodes than others. A best practice when
multiple partition groups are needed is to use a nonoverlapping model, as shown in
Figure 6.9.
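Skew of this kind can be estimated before a layout is deployed. The following sketch is a rough illustration rather than a feature of DB2 or Teradata. It assumes each table's workload is divided evenly among the nodes in its partition group and reports the ratio of the busiest node's load to the average load. The layouts and demand figures are hypothetical, and the nonoverlapping groups are deliberately sized in proportion to each table's demand, which is the favorable case.

```python
def node_loads(layout, demand, node_ids):
    """Sum the work each node receives from every table placed on it.

    layout: table name -> list of node ids in its partition group
    demand: table name -> total work units (hypothetical figures)
    """
    loads = {n: 0.0 for n in node_ids}
    for table, nodes in layout.items():
        # Assumption: work is spread evenly across the group's nodes.
        share = demand[table] / len(nodes)
        for n in nodes:
            loads[n] += share
    return loads


def skew(loads):
    """Busiest node's load divided by the mean load (1.0 = balanced)."""
    mean = sum(loads.values()) / len(loads)
    return max(loads.values()) / mean


NODE_IDS = [1, 2, 3, 4, 5, 6]
DEMAND = {"T1": 18, "T2": 6, "T3": 12}  # assumed work units per table

# Assumed layouts; only T1's placement is taken from the text.
overlap_layout = {"T1": [1, 2, 3, 4, 5, 6], "T2": [1, 2, 3], "T3": [1, 2]}
nonoverlap_layout = {"T1": [4, 5, 6], "T2": [1], "T3": [2, 3]}

overlap_loads = node_loads(overlap_layout, DEMAND, NODE_IDS)
nonoverlap_loads = node_loads(nonoverlap_layout, DEMAND, NODE_IDS)

print("overlapping loads:   ", overlap_loads, "skew:", skew(overlap_loads))
print("nonoverlapping loads:", nonoverlap_loads, "skew:", skew(nonoverlap_loads))
```

With these assumed inputs, the overlapping layout loads nodes 1 and 2 with 11 work units each while nodes 4 through 6 receive 3, whereas the nonoverlapping layout gives every node 6 units. A nonoverlapping design whose group sizes do not match the workload would still show skew, so the metric is best read as a check on a proposed layout and not as a guarantee.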