a [10,10,10,10] resource, we would create a buffer resource with 10*10*10*10 = 10,000
elements. Then we would modify our numthreads attribute to use a size of [10,10,10],
so that each thread group covers 1,000 elements, and use a dispatch call of size
[10,1,1] to launch ten such groups. Listing 5.4 demonstrates how the shader would
be modified to calculate the index to look up in the buffer for each element.
Buffer<float> InputBuf : register( t0 );
RWBuffer<float> OutputBuf : register( u0 );

// Extent of the resource in each dimension; x, y, and z also give the group size
#define size_x 10
#define size_y 10
#define size_z 10
#define size_w 10

// Declare one thread for each element of a 3D slice of the input buffer.
// Each thread group covers one slice along the fourth dimension.
[numthreads( size_x, size_y, size_z )]
void CSMAIN( uint3 GroupThreadID : SV_GroupThreadID, uint3 GroupID : SV_GroupID )
{
    // The position within the group supplies x, y, and z. SV_DispatchThreadID.x
    // already includes GroupID.x * size_x, so it would not stay below size_x.
    int index = GroupThreadID.x +
                GroupThreadID.y * size_x +
                GroupThreadID.z * size_x * size_y +
                GroupID.x * size_x * size_y * size_z;

    float Value = InputBuf.Load( index );
    OutputBuf[index] = 2.0f * Value;
}
Listing 5.4. A sample compute shader for doubling the contents of a custom 4D resource.
Here we simply change our resource access code to use each thread's SV_GroupID
system value as the fourth-dimension index, and then calculate a linear address for that
particular element's location in the buffer resource. The important consideration here is
that how a resource is accessed does not need to be simple; it is quite possible to calculate
arbitrary memory locations within a resource, which provides significant freedom for the
developer to implement the desired access patterns.
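The linear addressing described above can be checked on the CPU. The following is a minimal C++ sketch, not part of the original listing, that emulates a dispatch of ten groups of [10,10,10] threads and counts how often each buffer element is addressed. It assumes x, y, and z are the thread's coordinates within its group and w is the group index; the names LinearIndex, CountVisits, and EveryElementVisitedOnce are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <vector>

// Extents of the hypothetical 4D resource, mirroring the defines in Listing 5.4.
constexpr unsigned kSizeX = 10;
constexpr unsigned kSizeY = 10;
constexpr unsigned kSizeZ = 10;
constexpr unsigned kSizeW = 10;

// Linear address of element (x, y, z, w), with x varying fastest.
constexpr unsigned LinearIndex(unsigned x, unsigned y, unsigned z, unsigned w)
{
    return x + y * kSizeX + z * kSizeX * kSizeY + w * kSizeX * kSizeY * kSizeZ;
}

// Emulates a dispatch of kSizeW groups, each with kSizeX * kSizeY * kSizeZ
// threads, and returns how many times each buffer element was addressed.
std::vector<int> CountVisits()
{
    std::vector<int> visits(kSizeX * kSizeY * kSizeZ * kSizeW, 0);
    for (unsigned group = 0; group < kSizeW; ++group)      // group index (w)
        for (unsigned z = 0; z < kSizeZ; ++z)              // in-group z
            for (unsigned y = 0; y < kSizeY; ++y)          // in-group y
                for (unsigned x = 0; x < kSizeX; ++x) {    // in-group x
                    const unsigned index = LinearIndex(x, y, z, group);
                    assert(index < visits.size());         // never out of range
                    ++visits[index];
                }
    return visits;
}

// True when every element of the buffer is addressed by exactly one thread.
bool EveryElementVisitedOnce()
{
    for (int count : CountVisits())
        if (count != 1)
            return false;
    return true;
}
```

Calling EveryElementVisitedOnce() is a quick way to confirm that an addressing scheme neither skips elements nor maps two threads to the same location, which is worth checking whenever the index calculation is changed.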
5.2.4 Thread Execution Patterns
Throughout this section, we have seen how the threading model of the compute shader
functions from the developer's perspective. However, it is important to understand that