Figure 7.7: Bulk synchronous design pattern for a parallel program; preempting one processor can stall the synchronization after each step.

Figure 7.8: Producer-consumer design pattern for a parallel program; preempting one stage can stall the remainder.
Bulk synchronous delay. A common design pattern in parallel programs is to split work into roughly equal-sized chunks; once each chunk is done, the processors synchronize, waiting for every chunk to complete before communicating their results to the next stage of the computation. This bulk synchronous parallelism is easy to manage: each processor works independently, sharing its results only with the next stage in the computation. Google MapReduce is a widely used bulk synchronous application.
Figure 7.7 illustrates the problem with bulk synchronous computation un-
der oblivious scheduling. At each step, the computation is limited by the
slowest processor to complete that step. If a processor is preempted, its
work will be delayed, stalling the remaining processors until the last one
is scheduled. Even if one of the waiting processors picks up the preempted
task, a single preemption can delay the entire computation by a factor of
two, or possibly even more with cache effects. Since the application does not know that a processor was preempted, it cannot adapt its decomposition to the number of processors actually available, so each step is similarly delayed until the processor is returned.
Producer-consumer delay. Some parallel applications use a producer-
consumer design pattern, where the results of one thread are fed to the
next thread, and the output of that thread is fed onward, as in Figure 7.8.
Preempting a thread in the middle of a producer-consumer chain can stall
all of the processors in the chain.
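A producer-consumer chain can be sketched as a sequence of stages connected by queues; each stage reads from its input queue and writes to the next. The stage functions and sentinel protocol below are illustrative choices, not from the text. If any one stage's thread is preempted, its output queue stops filling and every downstream stage stalls waiting on `get()`.

```python
import queue
import threading

def stage(inq, outq, transform):
    # Each stage consumes items from its input queue and produces
    # transformed items for the next stage; None signals "no more work".
    while True:
        item = inq.get()
        if item is None:
            outq.put(None)
            break
        outq.put(transform(item))

q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
# A two-stage chain: square each item, then add one.
stages = [
    threading.Thread(target=stage, args=(q0, q1, lambda x: x * x)),
    threading.Thread(target=stage, args=(q1, q2, lambda x: x + 1)),
]
for t in stages:
    t.start()
for item in [1, 2, 3, None]:
    q0.put(item)
for t in stages:
    t.join()

out = []
while True:
    item = q2.get()
    if item is None:
        break
    out.append(item)
# out == [2, 5, 10]
```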
Critical path delay. More general parallel programs have a critical path: the minimum sequence of steps for the application to compute its result, illustrated in Figure 7.9. Work off the critical path can be done in parallel, but its precise scheduling is less important. Preempting a thread on the critical path, however, will slow down the end result. Although the application programmer may know which parts of the computation are on the critical path, with oblivious scheduling, the operating system will not; it is equally likely to preempt a thread on the critical path as off it.
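The critical path of a task graph can be computed as the longest-cost chain of dependencies. The task graph below is hypothetical, invented only to make the idea concrete: however many processors are available, the computation cannot finish in less time than the critical path, so delaying any task on that path delays the final result.

```python
# Hypothetical task graph: name -> (cost, prerequisite tasks).
tasks = {
    "a": (2, []),
    "b": (3, ["a"]),
    "c": (1, ["a"]),
    "d": (4, ["b", "c"]),
}

def critical_path_length(name):
    # The earliest a task can finish is its own cost plus the
    # longest chain of prerequisites leading up to it.
    cost, deps = tasks[name]
    return cost + max((critical_path_length(d) for d in deps), default=0)
```

Here the path a -> b -> d (cost 2 + 3 + 4 = 9) is critical; task c (cost 1) can run in parallel with b without affecting the finish time.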
Preemption of lock holder. Many parallel programs use locks and condition variables to synchronize their parallel execution. Often, to reduce the cost of acquiring locks, parallel programs use a "spin-then-wait" strategy: if a lock is busy, the waiting thread spin-waits briefly in the hope that the lock will be released quickly, and blocks only if the lock remains busy.
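A spin-then-wait lock can be sketched on top of an ordinary mutex: try a bounded number of non-blocking acquires first, then fall back to a blocking acquire. The class name and spin bound below are illustrative assumptions, not an API from the text.

```python
import threading

SPIN_LIMIT = 1000   # hypothetical bound on spin attempts before blocking

class SpinThenWaitLock:
    """Sketch of a spin-then-wait lock built on an ordinary mutex."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self):
        # Spin first: repeatedly try a non-blocking acquire, hoping the
        # holder releases soon so we avoid the cost of blocking.
        for _ in range(SPIN_LIMIT):
            if self._lock.acquire(blocking=False):
                return
        # The lock stayed busy; fall back to a blocking acquire,
        # yielding the processor until the lock is free.
        self._lock.acquire()

    def release(self):
        self._lock.release()
```

The spin pays off only when the holder runs to completion quickly; if the holder is preempted, every spinner burns its spin budget for nothing and then blocks anyway.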