A measure of the performance of a constraint solver is the total number of
batches created by a batching method. As the number of batches increases, the
number of serial computations and synchronizations increases, which results in
a longer constraint-solving time. When a rigid body in the simulation is
connected by $n$ constraints, at least $n$ batches are necessary to batch these
constraints because all of them must be executed in different batches. Thus,
the number of batches created by global batching is at least
$n^{\mathrm{global}}_{\max} = \max(n_0, n_1, \ldots, n_m)$ for a simulation
with $m$ rigid bodies, where $n_i$ is the number of constraints connected to
body $i$. When constraints are solved by the two-level constraint solver with
local batching, it splits constraints into four constraint sets. The number of
batches required for set $i$ is
$n^{\mathrm{local}_i}_{\max} = \max(n^i_0, n^i_1, \ldots, n^i_m)$, where
$n^i_j$ counts the constraints of set $i$ connected to body $j$. Constraint
sets must be processed sequentially; thus, the number of batches for the
two-level constraint solver is $\sum_{i=0}^{3} n^{\mathrm{local}_i}_{\max}$.
If all the constraints of the rigid body having $n^{\mathrm{global}}_{\max}$
constraints belong to constraint set 0, then
$n^{\mathrm{local}_0}_{\max} = n^{\mathrm{global}}_{\max}$. Therefore, the
total number of batches of the two-level constraint solver cannot be less than
the number of batches created by global batching
($\sum_{i=0}^{3} n^{\mathrm{local}_i}_{\max} \geq n^{\mathrm{local}_0}_{\max} = n^{\mathrm{global}}_{\max}$).
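The comparison above can be checked numerically. The following is a minimal sketch, assuming constraints are given as pairs of body indices and that a hypothetical list `set_of` assigns each constraint to one of the four constraint sets. The function names are illustrative and are not taken from the implementation described in this chapter; the values computed are the lower bounds discussed in the text, not the batch counts an actual batching method would produce.

```python
from collections import Counter

def max_constraints_per_body(constraints):
    """Largest number of constraints that touch any single body."""
    counts = Counter()
    for body_a, body_b in constraints:
        counts[body_a] += 1
        counts[body_b] += 1
    return max(counts.values(), default=0)

def global_lower_bound(constraints):
    """n_global_max: lower bound on the batch count for global batching."""
    return max_constraints_per_body(constraints)

def two_level_lower_bound(constraints, set_of, num_sets=4):
    """Sum over constraint sets of n_local_i_max (sets run sequentially)."""
    total = 0
    for s in range(num_sets):
        subset = [c for c, k in zip(constraints, set_of) if k == s]
        total += max_constraints_per_body(subset)
    return total

# Hypothetical scene: body 0 is connected by four constraints.
constraints = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
set_of = [0, 0, 1, 2, 3]  # assumed assignment of constraints to sets

print(global_lower_bound(constraints))             # 4
print(two_level_lower_bound(constraints, set_of))  # 5
```

In this example the two-level bound (5) exceeds the global bound (4); placing every constraint in set 0 makes the two bounds equal, matching the case described above.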
If a simulation is executed on a highly parallel processor such as a GPU, a
large number of constraints can be solved at the same time. If the number of
constraints in a batch is less than the hardware parallel width, the solving
time for each batch should be roughly the same. Therefore, the number of
batches can serve as an estimate of the computation time of a constraint
solver. From this comparison of batch counts, the two-level constraint solver
cannot, in theory, outperform a parallel constraint solver using the optimal
global batching.
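To illustrate why the batch count can stand in for solving time, consider a deliberately simplified cost model: each batch is assumed to take one fixed-cost pass per `parallel_width` constraints. Both the model and the name `estimated_passes` are assumptions made for illustration; real hardware timing depends on many other factors.

```python
import math

def estimated_passes(batch_sizes, parallel_width):
    """Number of fixed-cost passes under the assumed model.

    A batch that fits within the parallel width costs one pass, so when
    every batch fits, the result equals the number of non-empty batches.
    """
    return sum(math.ceil(size / parallel_width)
               for size in batch_sizes if size > 0)

batch_sizes = [100, 200, 50]  # hypothetical constraint counts per batch

print(estimated_passes(batch_sizes, 2048))  # 3: every batch fits in one pass
print(estimated_passes(batch_sizes, 64))    # 7: larger batches need extra passes
```

When every batch is smaller than the parallel width, the estimate reduces to the batch count, which is the regime the argument above relies on.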
4.6 Results and Discussion
The presented method is implemented on the GPU, and the constraints in six
benchmark scenes (shown in Figure 4.5) are solved using an AMD Radeon HD7970
GPU. Table 4.1 shows the data from those simulations. To evaluate the
performance of the two-level constraint solver, we also implemented the global
constraint solver, which uses the CPU for global batching and the GPU for
solving constraints. In our implementation, constraints are not stored one by
one; instead, up to four constraints for a colliding pair are stored as a
constraint pair.
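The constraint-pair storage can be pictured with a small data structure. This is only a sketch under assumptions: the class name, field names, and the representation of a constraint as a contact point are hypothetical, and the actual GPU layout used in the implementation is not described here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_CONSTRAINTS_PER_PAIR = 4  # from the text: up to four per colliding pair

@dataclass
class ConstraintPair:
    """All constraints between one colliding pair of bodies (hypothetical layout)."""
    body_a: int
    body_b: int
    contacts: List[Tuple[float, float, float]] = field(default_factory=list)

    def add_contact(self, point: Tuple[float, float, float]) -> None:
        """Append a contact constraint, enforcing the four-constraint limit."""
        if len(self.contacts) >= MAX_CONSTRAINTS_PER_PAIR:
            raise ValueError("a constraint pair holds at most four constraints")
        self.contacts.append(point)

pair = ConstraintPair(body_a=0, body_b=1)
pair.add_contact((0.0, 1.0, 0.0))
pair.add_contact((0.5, 1.0, 0.0))
```

Grouping constraints this way means batching operates on pairs of bodies rather than on individual constraints, so the items being batched are constraint pairs.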
4.6.1 Evaluation of Global Split
The space is split into 64 × 64 cells for a global split for all benchmarks.
The number of splits is the only parameter we need to specify to implement our
constraint solver, other than the physical parameters for a constraint solver.
It is not likely that rigid bodies are distributed evenly in the simulation
space (i.e., it is unlikely that all cells are populated in a simulation).
Therefore, we want to set the number of