13.7.2 Four Spheres Versus Four AABBs SIMD Test
A sphere-AABB test can be implemented in a similar manner. As before, let the four
spheres be given by the 4-tuples (x_i, y_i, z_i, r_i). Let the four AABBs be given by the
opposing minimum and maximum corners (min[i]x, min[i]y, min[i]z) and (max[i]x,
max[i]y, max[i]z). By first defining the symbolic SIMD registers
SX   = x1 | x2 | x3 | x4
SY   = y1 | y2 | y3 | y4
SZ   = z1 | z2 | z3 | z4
SR   = r1 | r2 | r3 | r4
MINX = min1x | min2x | min3x | min4x
MINY = min1y | min2y | min3y | min4y
MINZ = min1z | min2z | min3z | min4z
MAXX = max1x | max2x | max3x | max4x
MAXY = max1y | max2y | max3y | max4y
MAXZ = max1z | max2z | max3z | max4z
the test is then implemented by the following pseudocode:
; Find point T = (TX, TY, TZ) on/in AABB, closest to sphere center S.
; Computed by clamping sphere center to AABB extents.
MAX TX,SX,MINX      ; TX = Max(SX,MINX)
MAX TY,SY,MINY      ; TY = Max(SY,MINY)
MAX TZ,SZ,MINZ      ; TZ = Max(SZ,MINZ)
MIN TX,TX,MAXX      ; TX = Min(TX,MAXX)
MIN TY,TY,MAXY      ; TY = Min(TY,MAXY)
MIN TZ,TZ,MAXZ      ; TZ = Min(TZ,MAXZ)
; D = S - T is vector between S and clamped center T
SUB DX,SX,TX        ; DX = SX - TX
SUB DY,SY,TY        ; DY = SY - TY
SUB DZ,SZ,TZ        ; DZ = SZ - TZ
; Finally compute Result = Dot(D, D) <= SR^2, where SR is sphere radius.
; (To reduce the latency of having two sequential additions in the dot
; product, move DZ^2 term over to right-hand side of comparison and
; subtract it off SR^2 instead.)
MUL R2,SR,SR        ; R2 = SR * SR
MUL DX,DX,DX        ; DX = DX * DX
MUL DY,DY,DY        ; DY = DY * DY
MUL DZ,DZ,DZ        ; DZ = DZ * DZ
ADD T1,DX,DY        ; T1 = DX + DY
SUB T2,R2,DZ        ; T2 = R2 - DZ
LEQ Result,T1,T2    ; Result = T1 <= T2
This code is 16 instructions long, and at an instruction throughput/latency of 1/4 cycles
it gives the final result in 23/26 cycles.
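As a rough illustration, the pseudocode above could be mapped onto SSE intrinsics as in the following C sketch. It assumes an x86 target with SSE support and a structure-of-arrays layout in which lane i holds sphere i and AABB i. The struct SphereAabbBatch and the function spheres_vs_aabbs4 are hypothetical names introduced for this example only. They do not come from the text. Each pseudocode instruction corresponds to one intrinsic, and the final movemask step is an addition that packs the four comparison results into an integer.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86/x86-64 target */

/* Hypothetical structure-of-arrays batch: lane i = sphere i and AABB i. */
typedef struct {
    float sx[4], sy[4], sz[4], sr[4];   /* sphere centers and radii */
    float minx[4], miny[4], minz[4];    /* AABB minimum corners */
    float maxx[4], maxy[4], maxz[4];    /* AABB maximum corners */
} SphereAabbBatch;

/* Tests four sphere/AABB pairs at once.
   Returns a 4-bit mask; bit i is set when sphere i touches or overlaps AABB i. */
int spheres_vs_aabbs4(const SphereAabbBatch *b)
{
    __m128 SX = _mm_loadu_ps(b->sx), SY = _mm_loadu_ps(b->sy);
    __m128 SZ = _mm_loadu_ps(b->sz), SR = _mm_loadu_ps(b->sr);
    __m128 MINX = _mm_loadu_ps(b->minx), MAXX = _mm_loadu_ps(b->maxx);
    __m128 MINY = _mm_loadu_ps(b->miny), MAXY = _mm_loadu_ps(b->maxy);
    __m128 MINZ = _mm_loadu_ps(b->minz), MAXZ = _mm_loadu_ps(b->maxz);

    /* T = sphere center clamped to the AABB extents (MAX, then MIN). */
    __m128 TX = _mm_min_ps(_mm_max_ps(SX, MINX), MAXX);
    __m128 TY = _mm_min_ps(_mm_max_ps(SY, MINY), MAXY);
    __m128 TZ = _mm_min_ps(_mm_max_ps(SZ, MINZ), MAXZ);

    /* D = S - T, then square each component. */
    __m128 DX = _mm_sub_ps(SX, TX);
    __m128 DY = _mm_sub_ps(SY, TY);
    __m128 DZ = _mm_sub_ps(SZ, TZ);
    __m128 R2 = _mm_mul_ps(SR, SR);
    DX = _mm_mul_ps(DX, DX);
    DY = _mm_mul_ps(DY, DY);
    DZ = _mm_mul_ps(DZ, DZ);

    /* Compare DX^2 + DY^2 <= SR^2 - DZ^2, as in the pseudocode. */
    __m128 T1 = _mm_add_ps(DX, DY);
    __m128 T2 = _mm_sub_ps(R2, DZ);
    __m128 result = _mm_cmple_ps(T1, T2);

    /* Pack the sign bit of each lane into bits 0..3 of an int. */
    return _mm_movemask_ps(result);
}
```

A caller would fill one SphereAabbBatch per group of four pairs and inspect the returned mask. Bit 0 corresponds to the first pair, bit 3 to the last. Unaligned loads are used here for simplicity. If the arrays are known to be 16-byte aligned, _mm_load_ps could be used instead.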
13.7.3 Four AABBs Versus Four AABBs SIMD Test
For performing four AABB-AABB tests in parallel, let eight AABBs be
given by opposing minimum and maximum corners (min[i]x, min[i]y, min[i]z)
 