skip the next N items again. The stripe algorithm is easiest to understand if you imagine a stripe of one item. In the case of four worker tasks, the first task gets the items at indices 0, 4, 8, 12, and so on. The second task gets the items at indices 1, 5, 9, 13, and so on. Striped partitions avoid any interthread synchronization when implementing TakeWhile() and SkipWhile() for the entire query. Striping also lets each worker thread find the next items it should process using simple arithmetic.
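A minimal sketch of that arithmetic follows. This is an illustration of the index math only, not PLINQ's internal code; the worker loop, names, and use of Parallel.For are my own scaffolding. With four workers and a stripe of one item, worker w starts at index w and repeatedly adds the worker count.

using System;
using System.Threading.Tasks;

class StripeSketch
{
    static void Main()
    {
        int[] source = new int[16];     // stand-in data source
        const int workerCount = 4;      // four worker tasks
        const int stripeSize = 1;       // a stripe of one item

        Parallel.For(0, workerCount, worker =>
        {
            // Worker w owns indices w, w + 4, w + 8, and so on.
            // No locks or shared counters are needed; each worker
            // computes its own next index with simple arithmetic.
            for (int start = worker * stripeSize;
                 start < source.Length;
                 start += workerCount * stripeSize)
            {
                int end = Math.Min(start + stripeSize, source.Length);
                for (int i = start; i < end; i++)
                    Console.WriteLine($"worker {worker} -> index {i}");
            }
        });
    }
}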
The final algorithm is Hash Partitioning, a special-purpose algorithm designed for queries that use the Join, GroupJoin, GroupBy, Distinct, Except, Union, and Intersect operations. Those are more expensive operations, and a dedicated partitioning algorithm can enable greater parallelism on those queries. Hash Partitioning ensures that all items generating the same hash code are processed by the same task, which minimizes intertask communication for those operations.
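PLINQ selects the partitioning scheme internally, so the choice never shows up in your code, but a query like the following sketch, built around GroupBy over a parallel source, is the kind of query Hash Partitioning is designed for. The data and names here are illustrative.

using System;
using System.Linq;

class HashPartitionSketch
{
    static void Main()
    {
        string[] words = { "alpha", "beta", "gamma", "delta", "epsilon", "zeta" };

        // GroupBy is one of the operators that calls for hash partitioning:
        // all words whose key (here, the word length) produces the same hash
        // are routed to the same task, so each group can be assembled
        // without cross-task coordination.
        var byLength = words.AsParallel()
                            .GroupBy(w => w.Length)
                            .Select(g => new { Length = g.Key, Count = g.Count() });

        foreach (var group in byLength)
            Console.WriteLine($"length {group.Length}: {group.Count} word(s)");
    }
}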
Independent of the partitioning algorithm, there are three different algorithms used by PLINQ to parallelize tasks in your code: Pipelining, Stop & Go, and Inverted Enumeration. Pipelining is the default, so I'll explain that one first. In pipelining, one thread handles the enumeration (the foreach, or query sequence). Multiple threads are used to process the query on each of the elements in the sequence. As each new item in the sequence is requested, it will be processed by a different thread. The number of threads used by PLINQ in pipelining mode will usually be the number of cores (for most CPU-bound queries). In my factorial example, it would work with two threads on my dual-core machine. The first item would be retrieved from the sequence and processed by one thread. Immediately, the second item would be requested and processed by a second thread. Then, when one of those items finished, the third item would be requested, and the query expression would be processed by that thread. Throughout the execution of the query for the entire sequence, both threads would be busy with query items. On a machine with more cores, more items would be processed in parallel.
For example, on a 16-core machine, the first 16 items would be processed immediately by 16 different threads (presumably running on 16 different cores). I've simplified a little. There is a thread that handles the enumeration, and that often means Pipelining creates (Number of Cores + 1) threads. In most scenarios, the enumeration thread is waiting most of the time, so it makes sense to create one extra thread.
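The factorial example mentioned above appears elsewhere in the text, so the following is a hypothetical reconstruction of that kind of query under the default pipelining mode, assuming a simple Factorial helper over BigInteger. The foreach runs on one thread, pulling results while PLINQ's worker threads compute the Select projection for upcoming elements.

using System;
using System.Linq;
using System.Numerics;

class PipeliningSketch
{
    // A straightforward factorial over BigInteger so larger
    // inputs don't overflow a fixed-width integer.
    static BigInteger Factorial(int n)
    {
        BigInteger result = BigInteger.One;
        for (int i = 2; i <= n; i++)
            result *= i;
        return result;
    }

    static void Main()
    {
        // Pipelining is the default: enumerating the query with foreach
        // keeps this thread consuming results while the worker threads
        // (usually one per core, plus the enumeration thread) evaluate
        // the query on upcoming items.
        var factorials = Enumerable.Range(1, 20)
                                   .AsParallel()
                                   .AsOrdered()
                                   .Select(n => Factorial(n));

        foreach (BigInteger f in factorials)
            Console.WriteLine(f);
    }
}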
 