Figure 25-16. Parallelism: Gather Streams
Finally, in Repartition Streams mode, the Parallelism operator accepts data from multiple producer threads and
distributes it across multiple consumer threads. This happens in the middle of a parallel zone of the plan when the
data needs to be redistributed between execution threads. Figure 25-17 illustrates this concept.
Figure 25-17. Parallelism: Repartition Streams
There are several different ways that data can be distributed between consumer threads. Table 25-4 summarizes
these methods.
Table 25-4. Data redistribution methods in parallelism

Redistribution Method    Description
Broadcast                Send row to all consumer threads
Round Robin              Send row to the next consumer thread in sequence
Demand                   Send row to the next consumer thread that requests the row
Range                    Use range function to determine which consumer thread should get a row
Hash                     Use hash function to determine which consumer thread should get a row
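The routing rules in Table 25-4 can be sketched in a few lines of Python. This is only an illustration of the ideas, not SQL Server's implementation: the function names, the use of Python's built-in hash(), and the boundary list for the range method are all assumptions made for the example. Demand is left out because its destination depends on which consumer asks for a row next, so it cannot be written as a function of the row alone.

```python
from bisect import bisect_right
from itertools import count


def broadcast(row, n_consumers):
    """Broadcast: every consumer thread receives a copy of the row."""
    return list(range(n_consumers))


def make_round_robin(n_consumers):
    """Round Robin: consumers are chosen in a repeating sequence."""
    counter = count()

    def route(row):
        return [next(counter) % n_consumers]

    return route


def hash_route(key, n_consumers):
    """Hash: a hash of the partitioning key picks the consumer.

    Python's built-in hash() stands in for whatever hash function
    the engine really uses (an assumption for this sketch).
    """
    return [hash(key) % n_consumers]


def range_route(key, boundaries):
    """Range: sorted boundaries split the key space into intervals,
    and each interval maps to one consumer (hypothetical boundaries)."""
    return [bisect_right(boundaries, key)]
```

With hash and range routing, rows with the same key always land on the same consumer thread, which is why a skewed key distribution can translate directly into a skewed workload.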
The Parallelism operator uses a different execution model from other operators: a push-based model, in which
producer threads push rows to it. This is the opposite of the pull-based model, in which a parent operator calls the
GetRow() method of its child operator to get the data.
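The difference between the two models can be sketched as follows. The class and function names (Scan, pull_all, producer, gather) are hypothetical, and a Python queue stands in for the exchange buffers; the sketch also assumes that no row is None, since None marks the end of data in the pull-based half.

```python
import queue
import threading


class Scan:
    """Pull-based operator: it does nothing until the parent asks for a row."""

    def __init__(self, rows):
        self._it = iter(rows)

    def get_row(self):
        return next(self._it, None)  # None signals end of data


def pull_all(child):
    """The parent drives execution by repeatedly calling get_row()."""
    out = []
    while (row := child.get_row()) is not None:
        out.append(row)
    return out


_DONE = object()  # sentinel each producer pushes when it has no more rows


def producer(rows, exchange):
    """Push-based: the producer thread drives execution and pushes rows."""
    for row in rows:
        exchange.put(row)
    exchange.put(_DONE)


def gather(partitions):
    """Gather-style exchange: many producers push, one consumer reads."""
    exchange = queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(part, exchange))
        for part in partitions
    ]
    for t in threads:
        t.start()
    out, finished = [], 0
    while finished < len(threads):
        item = exchange.get()
        if item is _DONE:
            finished += 1
        else:
            out.append(item)
    for t in threads:
        t.join()
    return out
```

In the pull-based half the consumer sets the pace; in the push-based half the producers do, so the order in which rows arrive at the consumer is not guaranteed.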
An evenly distributed workload is a key element for good performance of parallel execution plans. You can see
the number of rows processed by each thread in the “Actual Number of Rows” section of the Properties window in
Management Studio. That information is not displayed in the tool-tips of graphical execution plans. Thread 0 is the
parallelism management thread, which always shows zero as the number of rows.
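One simple way to put a number on how even the distribution is: divide the largest per-thread row count by the average. The helper below is a hypothetical sketch, not a SQL Server feature, and it assumes that the counts were copied by hand from the Properties window with thread 0 already left out.

```python
def skew_ratio(rows_per_thread):
    """Return max / mean of per-thread row counts; 1.0 means perfectly even."""
    # Assumption: thread 0 (the management thread) has already been excluded.
    if not rows_per_thread:
        raise ValueError("need at least one worker thread")
    mean = sum(rows_per_thread) / len(rows_per_thread)
    if mean == 0:
        return 1.0  # no rows processed at all, so nothing is skewed
    return max(rows_per_thread) / mean
```

For example, with made-up counts, four threads that each process 250 rows give a ratio of 1.0, while counts of 700, 100, 100, and 100 give 2.8, meaning the busiest thread handles almost three times its fair share.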
Uneven data distribution and outdated statistics are common causes of uneven workload distribution between
threads. Figure 25-18 shows how workload distribution changes after a statistics update on one of the tables. The left
side shows the distribution before the statistics update and the right side shows it after the update.
 
 