and (max[i]x, max[i]y, max[i]z), for 1 ≤ i ≤ 8.
Defining the symbolic SIMD registers
MIN1X = min1x | min2x | min3x | min4x
MIN2X = min5x | min6x | min7x | min8x
MIN1Y = min1y | min2y | min3y | min4y
MIN2Y = min5y | min6y | min7y | min8y
MIN1Z = min1z | min2z | min3z | min4z
MIN2Z = min5z | min6z | min7z | min8z
MAX1X = max1x | max2x | max3x | max4x
MAX2X = max5x | max6x | max7x | max8x
MAX1Y = max1y | max2y | max3y | max4y
MAX2Y = max5y | max6y | max7y | max8y
MAX1Z = max1z | max2z | max3z | max4z
MAX2Z = max5z | max6z | max7z | max8z
allows the tests to be performed by the following pseudocode.
; Compute the intersection volume of the two bounding boxes by taking the
; maximum value of the minimum extents and the minimum value of the
; maximum extents.
MAX AX,MIN1X,MIN2X      ; AX = Max(MIN1X,MIN2X)
MIN BX,MAX1X,MAX2X      ; BX = Min(MAX1X,MAX2X)
MAX AY,MIN1Y,MIN2Y      ; AY = Max(MIN1Y,MIN2Y)
MIN BY,MAX1Y,MAX2Y      ; BY = Min(MAX1Y,MAX2Y)
MAX AZ,MIN1Z,MIN2Z      ; AZ = Max(MIN1Z,MIN2Z)
MIN BZ,MAX1Z,MAX2Z      ; BZ = Min(MAX1Z,MAX2Z)
; If the intersection volume is valid (if the two AABBs are overlapping),
; the resulting minimum extents must be smaller than the resulting
; maximum extents.
LEQ T1,AX,BX            ; T1 = AX <= BX
LEQ T2,AY,BY            ; T2 = AY <= BY
LEQ T3,AZ,BZ            ; T3 = AZ <= BZ
AND T4,T1,T2            ; T4 = T1 && T2
AND Result,T3,T4        ; Result = T3 && T4
This code is 11 instructions long; with an instruction throughput/latency of 1/4 cycles,
it gives the final result in 16/19 cycles.
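For readers who want to trace the pseudocode, the following is a minimal portable C sketch of the same 11 operations. It is not the author's code: the types Vec4, Mask4, and Aabb4 and all function names are hypothetical, and each "register" is modeled as a plain array of four lanes so that the example compiles without SIMD intrinsics. On real SIMD hardware, each helper would presumably map to a single instruction.

```c
#include <stdbool.h>

/* Hypothetical 4-wide register types; each array element models one SIMD lane. */
typedef struct { float lane[4]; } Vec4;
typedef struct { bool  lane[4]; } Mask4;

/* Lane-wise maximum (the MAX pseudo-instruction). */
static Vec4 vec4_max(Vec4 a, Vec4 b) {
    Vec4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] > b.lane[i] ? a.lane[i] : b.lane[i];
    return r;
}

/* Lane-wise minimum (the MIN pseudo-instruction). */
static Vec4 vec4_min(Vec4 a, Vec4 b) {
    Vec4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] < b.lane[i] ? a.lane[i] : b.lane[i];
    return r;
}

/* Lane-wise less-than-or-equal comparison (the LEQ pseudo-instruction). */
static Mask4 vec4_leq(Vec4 a, Vec4 b) {
    Mask4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] <= b.lane[i];
    return r;
}

/* Lane-wise logical AND (the AND pseudo-instruction). */
static Mask4 mask4_and(Mask4 a, Mask4 b) {
    Mask4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] && b.lane[i];
    return r;
}

/* Extents of four AABBs in structure-of-arrays form (e.g. MIN1X ... MAX1Z). */
typedef struct { Vec4 minx, miny, minz, maxx, maxy, maxz; } Aabb4;

/* Tests box i of a against box i of b, for four pairs at once. */
Mask4 aabb4_overlap(const Aabb4 *a, const Aabb4 *b) {
    /* Intersection volume: max of the min extents, min of the max extents. */
    Vec4 ax = vec4_max(a->minx, b->minx);
    Vec4 bx = vec4_min(a->maxx, b->maxx);
    Vec4 ay = vec4_max(a->miny, b->miny);
    Vec4 by = vec4_min(a->maxy, b->maxy);
    Vec4 az = vec4_max(a->minz, b->minz);
    Vec4 bz = vec4_min(a->maxz, b->maxz);
    /* The volume is valid only if min <= max holds on every axis. */
    Mask4 t1 = vec4_leq(ax, bx);
    Mask4 t2 = vec4_leq(ay, by);
    Mask4 t3 = vec4_leq(az, bz);
    Mask4 t4 = mask4_and(t1, t2);
    return mask4_and(t3, t4);
}
```

Because the comparison is <=, boxes that merely touch on a face are reported as overlapping, matching the LEQ instructions in the pseudocode.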
13.8 Branching
Branching can be a performance issue for modern CPUs. An instruction, when executed
by the CPU, is not executed all at once, but rather in several stages. The reason
is parallelism. To see why, consider an abstract architecture with four stages.
1. Fetch. During this stage, the instruction is fetched from memory.
2. Decode. At this point, the bits of the instruction are examined and control is routed
to the place of actual execution of the instruction.
3. Execute. The actual operation specified by the instruction is performed.
4. Store. The result of the instruction is committed to a register or to memory.
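To make the parallelism argument concrete, the following C sketch counts the cycles needed to run n instructions with and without overlapping the stages. It is an idealized model, not a description of any real CPU: it assumes every stage takes exactly one cycle and that the pipeline never stalls, and both function names are hypothetical.

```c
/* Cycles needed when each instruction must finish all stages
   before the next one may start (no overlap). */
unsigned long cycles_unpipelined(unsigned long n, unsigned long stages) {
    return n * stages;
}

/* Cycles needed when a new instruction enters the first stage every
   cycle: the first instruction takes `stages` cycles, and each later
   one completes one cycle after its predecessor. */
unsigned long cycles_pipelined(unsigned long n, unsigned long stages) {
    return n == 0 ? 0 : stages + (n - 1);
}
```

With the four stages above, 100 instructions take 400 cycles without overlap but 103 cycles when pipelined under these assumptions, which is why keeping every stage busy matters.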
 