This attribute defines a thread group, also known as a kernel. In Listing 9.10 it defines
a 16x16x1 [7] array of threads per group. The body of the csMain method is executed
for a single thread, but through system-generated values it can identify which of these
256 (16x16x1) threads it actually is. Because each thread knows which one it is, the code can
be written to ensure that each thread reads from and writes to the correct location.
In Figure 9.16 the Dispatch(x, y, z) call is also introduced. This call is made by the
application and is analogous to a draw call in that it begins execution of the compute shader. At
this level, the parameters indicate how many 16x16x1 thread groups to create.
For this particular algorithm, the application simply divides the input height map texture
dimensions by 16 and uses the result as the number of kernels. [8]
For example, for a 1024x1024 height map, there will be 64x64 kernels, each kernel
being 16x16x1 threads. Conceptually, this implies a very large number of threads
(1,048,576, one per pixel in this case), but it is up to the implementation how these tasks
are scheduled on the GPU and how many actually execute concurrently.
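The arithmetic behind the group counts can be sketched on the CPU. The following Python snippet is only an illustration, not the sample's actual host code: the names GROUP_SIZE, dispatch_dimensions and total_threads are hypothetical, and the rounding-up step is an assumption added for sizes that are not multiples of 16 (the text itself simply divides by 16).

```python
# Hypothetical CPU-side sketch of how Dispatch(x, y, z) arguments could be derived.
GROUP_SIZE = (16, 16, 1)  # mirrors the 16x16x1 threads-per-group declaration


def dispatch_dimensions(width, height, group_size=GROUP_SIZE):
    """Return the (x, y, z) group counts needed to cover a width x height map."""
    gx, gy, _ = group_size
    # Ceiling division (an assumption): for exact multiples of 16 this is
    # the same as the plain division described in the text.
    return ((width + gx - 1) // gx, (height + gy - 1) // gy, 1)


def total_threads(groups, group_size=GROUP_SIZE):
    """Return the conceptual number of threads launched by a dispatch."""
    per_group = group_size[0] * group_size[1] * group_size[2]
    return groups[0] * groups[1] * groups[2] * per_group


groups = dispatch_dimensions(1024, 1024)
print(groups)                 # group counts for a 1024x1024 height map
print(total_threads(groups))  # conceptual thread count, one per pixel
```

For a 1024x1024 height map this yields 64x64x1 groups, matching the example above. How many of those threads run concurrently remains a scheduling decision of the implementation.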
A key detail that has been omitted until now is how an invocation can identify itself
relative to its group, as well as the entire dispatch call. Direct3D defines four system-
generated values for this purpose:
1. SV_GroupID
This uint3 returns indexes into the parameters provided to
ID3D11DeviceContext::Dispatch(). It allows this invocation to know which group this is,
relative to all others being executed. In this algorithm, it is the index into the
output texture where the results for the whole group are written.
2. SV_GroupThreadID
This uint3 returns indexes local to the current thread group, that is, into the parameters
provided at compile time as part of the [numthreads()] attribute. In this algorithm,
it is used to know which threads represent corner pixels for the current 16x16
area.
3. SV_DispatchThreadID
This uint3 is a combination of the previous two. Whereas they index relative
to only one set of input parameters (either ::Dispatch() or [numthreads()]),
this is a global index: the group index multiplied by the group size, plus the
thread's index within its group, on each axis. For a 64x64x1
dispatch of 16x16x1 threads, this system value will vary between 0 and 1023 in
both axes (64*16=1024). Thus, for this algorithm, it provides the thread with the
address of the source pixel to read from.
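The relationship between the three values described so far can be checked with a small CPU-side simulation. This is a sketch under stated assumptions rather than shader code: dispatch_thread_id, thread_ids and is_corner are hypothetical helper names, and the corner test is only one plausible reading of "corner pixels for the current 16x16 area".

```python
from itertools import product


def dispatch_thread_id(group_id, group_thread_id, num_threads):
    """Combine a group index and a local thread index into a global index.

    Per axis: global = group index * group size + local index.
    """
    return tuple(g * n + t for g, n, t in zip(group_id, num_threads, group_thread_id))


def thread_ids(dispatch, num_threads):
    """Yield (group_id, group_thread_id, dispatch_thread_id) for every thread."""
    for gz, gy, gx in product(range(dispatch[2]), range(dispatch[1]), range(dispatch[0])):
        for tz, ty, tx in product(range(num_threads[2]), range(num_threads[1]), range(num_threads[0])):
            group_id = (gx, gy, gz)
            local_id = (tx, ty, tz)
            yield group_id, local_id, dispatch_thread_id(group_id, local_id, num_threads)


def is_corner(group_thread_id, num_threads):
    """Hypothetical check: is this thread on a corner of its group's X/Y area?"""
    x, y, _ = group_thread_id
    return x in (0, num_threads[0] - 1) and y in (0, num_threads[1] - 1)


# Last thread of the last group in a 64x64x1 dispatch of 16x16x1 threads.
print(dispatch_thread_id((63, 63, 0), (15, 15, 0), (16, 16, 1)))
```

Iterating thread_ids over a small dispatch such as (2, 2, 1) is enough to see that every global index is produced exactly once, which is what lets each thread address its own source pixel.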
[7] The API expects (X, Y, Z) thread group notation, which is maintained in this text (and introduced in
Chapter 5), even if Z = 1 is not actually significant to this implementation.
[8] Multiples of 16 were chosen for convenience in the sample code; this can be changed, as appropriate, for
different-sized terrains.