manner. In other words, they are all updated in the same fashion, so there is no need to know precisely which particle is being processed by a particular thread. This is a fundamental difference between the fluid simulation and the particle system: the fluid simulation columns all rely on information from their neighbors, which requires that each column know its exact location within the simulation. The unordered nature of the particle system makes it particularly well suited to using append and consume buffers to hold the particles.
With this in mind, we will use two structured buffer resources to contain our particle system. One will hold the current particle data, and the other will receive the updated particle data after an update sequence is performed. Figure 12.11 depicts how these resources are used to provide and then receive the particle data. After an update pass is performed, the buffer that received the updated data becomes the current state of the simulation, and the process can be repeated in the next simulation step to refill the other buffer. Before a simulation update is performed, the current state buffer holds all of the particle data while the other buffer is empty. During processing, the elements are consumed from the current state buffer and then appended to the new state buffer. After processing, the roles are reversed, and the new state buffer holds all of the particle data.
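As a rough sketch of how this resource model might be set up on the application side, each buffer can be created as a structured buffer whose unordered access view carries the append flag, which allows the compute shader to bind it as either a consume buffer or an append buffer. The Particle layout, the maximum particle count, and the function and variable names below are illustrative assumptions rather than the book's exact listing.

#include <d3d11.h>

// Hypothetical per-particle record; the actual layout depends on the simulation.
struct Particle
{
    float position[3];
    float velocity[3];
    float time;
};

static const UINT MAX_PARTICLES = 100000;   // assumed upper limit on live particles

// Create one structured buffer plus an append/consume-capable UAV for it.
HRESULT CreateParticleBuffer( ID3D11Device* pDevice,
                              ID3D11Buffer** ppBuffer,
                              ID3D11UnorderedAccessView** ppUAV )
{
    D3D11_BUFFER_DESC desc = {};
    desc.ByteWidth           = MAX_PARTICLES * sizeof( Particle );
    desc.Usage               = D3D11_USAGE_DEFAULT;
    desc.BindFlags           = D3D11_BIND_UNORDERED_ACCESS | D3D11_BIND_SHADER_RESOURCE;
    desc.MiscFlags           = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
    desc.StructureByteStride = sizeof( Particle );

    HRESULT hr = pDevice->CreateBuffer( &desc, nullptr, ppBuffer );
    if ( FAILED( hr ) )
        return hr;

    // The append flag lets the shader treat this view as an
    // AppendStructuredBuffer or a ConsumeStructuredBuffer.
    D3D11_UNORDERED_ACCESS_VIEW_DESC uav = {};
    uav.Format              = DXGI_FORMAT_UNKNOWN;
    uav.ViewDimension       = D3D11_UAV_DIMENSION_BUFFER;
    uav.Buffer.FirstElement = 0;
    uav.Buffer.NumElements  = MAX_PARTICLES;
    uav.Buffer.Flags        = D3D11_BUFFER_UAV_FLAG_APPEND;

    return pDevice->CreateUnorderedAccessView( *ppBuffer, &uav, ppUAV );
}

Two such buffers are created, and after every update pass the application simply swaps which view is bound as the consume buffer and which as the append buffer for the next simulation step.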
Threading Scheme
Now that we have chosen our resource model, we must consider how we will invoke an appropriate number of threads to process these data sets. As we described above, the particles do not need to communicate with one another at all, which means there is no need to use the group shared memory. This lets us choose a threading orientation based solely on ensuring that an appropriate number of threads are instantiated to process all of the particles that are currently active. The upper limit to the number of particles in the system depends on the size of the buffer resources that are created to hold their data. The number of active particles can also drop below this limit as particles die off or are destroyed by the black hole.
This seems to present a problem. The number of particles can change from simulation step to simulation step, and the current number of active particles is only available in the buffer resource itself. However, the CPU is in control of specifying how many threads to instantiate through its dispatch methods. There is a device context method for reading the element count out of the buffer resource (ID3D11DeviceContext::CopyStructureCount), but this count is copied into another buffer resource. If the CPU tried to map that secondary buffer into system memory, it would incur a significant delay while the data was copied back to the CPU for reading. That would negate the benefits of having the fast GPU-based implementation, since we would be synchronizing data to the CPU in every frame. So instead of trying to read back the number of particles, we can take a more conservative approach with our thread invocation. We can estimate the number of particles in the system, based on the properties of the creation and destruction mechanisms that are used, and then round up to the next higher number of threads specified by our thread group size.
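A minimal sketch of this conservative dispatch is shown below, assuming a one-dimensional thread group size of 512 and hypothetical names for the views and the estimated count; none of these identifiers come from the book's listings.

// Assumed thread group size; it must match the [numthreads(512,1,1)]
// declaration in the particle update compute shader.
static const UINT THREAD_GROUP_SIZE = 512;

void UpdateParticles( ID3D11DeviceContext* pContext,
                      ID3D11UnorderedAccessView* pCurrentStateUAV,
                      ID3D11UnorderedAccessView* pNewStateUAV,
                      UINT estimatedParticleCount )   // CPU-side estimate, not an exact GPU count
{
    // Bind the consume (current) and append (new) buffers. The -1 initial
    // count preserves the current buffer's hidden element counter, while the
    // 0 resets the counter of the buffer that will receive the updated data.
    ID3D11UnorderedAccessView* pUAVs[2] = { pCurrentStateUAV, pNewStateUAV };
    UINT initialCounts[2]               = { (UINT)-1, 0 };
    pContext->CSSetUnorderedAccessViews( 0, 2, pUAVs, initialCounts );

    // Round the estimate up to the next whole number of thread groups, so at
    // least as many threads are launched as there are live particles.
    UINT groupCount = ( estimatedParticleCount + THREAD_GROUP_SIZE - 1 ) / THREAD_GROUP_SIZE;
    pContext->Dispatch( groupCount, 1, 1 );
}

Note that the update shader still needs to know the actual number of particles so that the surplus threads in the final group do not consume elements that do not exist; one common way to provide it is to copy the hidden counter into a small constant buffer with ID3D11DeviceContext::CopyStructureCount, keeping the count entirely on the GPU.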