Figure 4.3. Constraints in a group are stored in contiguous memory. Each group
stores its offset into that memory and its number of constraints. Local batching
sorts the constraints of a group by batch index using a SIMD.
4.3.2 Local Batching
After the global split, the constraint groups in a set are independent of one
another. In other words, the groups in a set can be processed in parallel. Each
group is processed by one SIMD of the GPU. There are, however, still dependencies
among the constraints within a group. Therefore, batching is necessary to process
them in parallel within each group. We call this local batching, in contrast to
the global batching described in Section 4.2.2, because batching at this step
must consider only the connectivity of the constraints in a group, which consists
of localized constraints.
One big difference between global and local batching is the width of the batch
we must create. For global batching, the optimal batch width is the hardware
computation width. For a GPU, which executes thousands of work items concurrently,
we have to extract thousands of independent constraints to fill the hardware.
However, because we assign a SIMD to solve the constraints in a group, local
batching only requires creating a batch as wide as the SIMD (e.g., on a GPU with
a 64-wide SIMD, only 64 independent constraints have to be extracted). Creating
a narrower batch is easier and computationally cheaper because the wider the
batch, the more expensive the dependency check of constraints becomes.
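To make the role of the batch width concrete, the following is a minimal serial sketch of width-limited batching. It assumes a simple greedy strategy and represents a constraint as a hypothetical pair of body indices; the function name `assign_batches` and the greedy scheme are illustrative assumptions, not the method used in the text, and a GPU version would run inside a SIMD rather than in a Python loop.

```python
from typing import List, Tuple

# Assumption: a constraint couples exactly two bodies, identified by index.
Constraint = Tuple[int, int]

def assign_batches(constraints: List[Constraint], width: int) -> List[int]:
    """Return one batch index per constraint (greedy sketch).

    Constraints in the same batch share no body, and a batch holds
    at most `width` constraints (e.g., the SIMD width).
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    batch_of = [-1] * len(constraints)
    remaining = list(range(len(constraints)))
    batch = 0
    while remaining:
        used_bodies = set()  # bodies already touched by the current batch
        taken = 0            # constraints placed in the current batch
        deferred = []        # constraints left for a later batch
        for i in remaining:
            a, b = constraints[i]
            # Dependency check: both bodies must be untouched in this batch.
            if taken < width and a not in used_bodies and b not in used_bodies:
                batch_of[i] = batch
                used_bodies.update((a, b))
                taken += 1
            else:
                deferred.append(i)
        remaining = deferred
        batch += 1
    return batch_of

example = [(0, 1), (1, 2), (2, 3), (0, 3), (4, 5)]
print(assign_batches(example, 64))  # [0, 1, 0, 1, 0]
print(assign_batches(example, 2))   # [0, 1, 0, 1, 2]
```

With a width of 64 the five constraints fit into two batches; capping the width at 2 forces a third batch even though constraint (4, 5) conflicts with nothing, which shows how the width limit, and not only the dependencies, determines where batches end.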
The input of local batching is a set of constraints. Batches are created at this
stage, and a batch index is calculated for each constraint. For the convenience of
the constraint solver, constraints are sorted by batch index. Figure 4.3 illustrates
how constraints are stored.
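The storage scheme can be sketched as follows, assuming one contiguous array of constraint slots and one hypothetical (offset, count) pair per group. The function name `sort_groups_by_batch` is an illustrative assumption, and the sort here is an ordinary serial sort on the CPU, whereas the text performs it with a SIMD on the GPU.

```python
from typing import List, Tuple

def sort_groups_by_batch(
    batch_index: List[int],
    groups: List[Tuple[int, int]],
) -> List[int]:
    """Return a slot ordering in which each group's range is sorted by batch index.

    batch_index[i] is the batch of the constraint stored in slot i of the
    contiguous array; groups holds one (offset, count) pair per group.
    """
    order = list(range(len(batch_index)))
    for offset, count in groups:
        slots = range(offset, offset + count)
        # sorted() is stable, so constraints of the same batch keep their order.
        order[offset:offset + count] = sorted(slots, key=lambda i: batch_index[i])
    return order

# Two groups stored back to back: slots 0..3 and slots 4..6.
batch_index = [1, 0, 2, 0, 0, 1, 0]
groups = [(0, 4), (4, 3)]
order = sort_groups_by_batch(batch_index, groups)
print(order)                            # [1, 3, 0, 2, 4, 6, 5]
print([batch_index[i] for i in order])  # [0, 0, 1, 2, 0, 0, 1]
```

Because each group only reorders slots inside its own range, groups never touch each other's memory, so they can be sorted independently. After sorting, a solver can walk a group's range front to back and treat each run of equal batch indices as one batch.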