by tracking the state of file splits in a central repository. Each time an adaptive
mapper finishes processing a split, it consults this central repository and locks
another split for processing until the job is completed. This means that for
adaptive mappers, only a single wave of mappers is deployed, because the
individual mappers remain open to consume additional splits. The performance
cost of locking a new split is far less than the startup cost for a new mapper,
which accounts for a significant increase in performance. The left side of
Figure 5-11 shows the benchmark results for a set-similarity join workload,
which had high map task startup costs that were mitigated by the use of adaptive
mappers. The adaptive mappers result (see the AM bar) was based on a
low split size of 32MB. Only a single wave of mappers was used, so there
were significant performance savings based on avoiding the startup costs for
additional mappers.
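
The claim-and-process loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the product's implementation: the SplitRepository class, its lock_next_split method, and the use of threads as stand-ins for mapper tasks are all hypothetical names and choices introduced for this example.

```python
import threading


class SplitRepository:
    """Hypothetical stand-in for the central repository that tracks split state."""

    def __init__(self, split_ids):
        self._pending = list(split_ids)
        self._lock = threading.Lock()

    def lock_next_split(self):
        """Atomically claim one unprocessed split, or return None if none remain."""
        with self._lock:
            return self._pending.pop() if self._pending else None


def adaptive_mapper(repo, process_split, claimed):
    """Started once; keeps claiming splits until the repository is empty."""
    while True:
        split = repo.lock_next_split()
        if split is None:
            return
        process_split(split)
        claimed.append(split)


def run_single_wave(num_mappers, split_ids, process_split):
    """Start num_mappers mappers once and let them drain every split."""
    repo = SplitRepository(split_ids)
    per_mapper = [[] for _ in range(num_mappers)]
    threads = [
        threading.Thread(target=adaptive_mapper, args=(repo, process_split, claimed))
        for claimed in per_mapper
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return per_mapper  # which splits each mapper ended up processing
```

Calling run_single_wave(4, range(32), some_function) starts four mappers exactly once. Each split is processed by exactly one of them, and a mapper that draws cheap splits simply claims more of them.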
For some workloads, any imbalance between map tasks is magnified by larger
split sizes, which causes additional performance problems. When using
adaptive mappers, you can (without penalty) avoid unbalanced workloads by
tuning jobs to use a smaller split size. Because there will only be a single wave
of mappers, your workload will not be crippled by the mapper startup costs of
many additional mappers. The right side of Figure 5-11 shows the benchmark
results for a join query on TERASORT records, in which an imbalance
occurred between individual map tasks that led to an unbalanced workload
for the larger split sizes. The adaptive mappers result (again, see the AM bar)
was based on a low split size of 32MB. Only a single wave of mappers was
[Figure 5-11: two bar-chart panels, each comparing Regular Mappers and Adaptive Mappers across Split Size (MB)]
Figure 5-11 Benchmarking a set-similarity join workload with high map task startup
costs reduced through the use of adaptive mappers
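
The trade-off between startup cost and split size can also be explored with a toy cost model. The sketch below assumes a simple greedy schedule in which each free slot takes the next split. All function names and numbers are hypothetical and are not taken from the benchmarks in Figure 5-11.

```python
import heapq


def makespan(split_costs, slots, per_split_overhead, one_time_startup=0.0):
    """Finish time of a greedy schedule: the earliest-free slot takes the next split."""
    finish = [one_time_startup] * slots
    heapq.heapify(finish)
    for cost in split_costs:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + per_split_overhead + cost)
    return max(finish)


def regular_mappers(split_costs, slots, startup):
    """Every split runs in a fresh mapper, so the startup cost is paid per split."""
    return makespan(split_costs, slots, per_split_overhead=startup)


def adaptive_mappers(split_costs, slots, startup, lock_cost):
    """Each slot pays the startup cost once, then only a small lock cost per split."""
    return makespan(
        split_costs, slots, per_split_overhead=lock_cost, one_time_startup=startup
    )


# Invented workload: 64 small splits of equal cost on 4 slots.
small_splits = [1.0] * 64
print(regular_mappers(small_splits, slots=4, startup=5.0))
print(adaptive_mappers(small_splits, slots=4, startup=5.0, lock_cost=0.1))
```

With these invented numbers, the regular schedule pays the startup cost 64 times in total, while the adaptive schedule pays it once per slot and adds only the lock cost for each split. Replacing small_splits with a list of uneven costs shows how a few large, expensive splits stretch the finish time in either model.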