Fig. 3.37 Resource-occupying cycles of SH-X for a 3D benchmark
[Figure: bars of resource-occupying cycles (axis ticks 0, 11, 19, 20, 26, 40, 52) for the FDIV/FSQRT and arithmetic resources, without special instructions (FTRV, FIPR, FSRRA) and with special instructions. Each bar is split into coordinate & perspective transformations and intensity calculation. Annotations: 58% shorter, 63% shorter.]
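\[
V'' = TV,\quad
V' = \frac{V''}{V''_w},\quad
S_x = \frac{V'_x}{V'_z},\quad
S_y = \frac{V'_y}{V'_z},\quad
N' = TN,\quad
I = \frac{(L, N')}{\sqrt{(N', N')}},
\]
\[
T = \begin{pmatrix}
T_{xx} & T_{xy} & T_{xz} & T_{xw} \\
T_{yx} & T_{yy} & T_{yz} & T_{yw} \\
T_{zx} & T_{zy} & T_{zz} & T_{zw} \\
T_{wx} & T_{wy} & T_{wz} & T_{ww}
\end{pmatrix},\quad
V = \begin{pmatrix} V_x \\ V_y \\ V_z \\ 1 \end{pmatrix},\quad
V' = \begin{pmatrix} V'_x \\ V'_y \\ V'_z \\ 1 \end{pmatrix},\quad
V'' = \begin{pmatrix} V''_x \\ V''_y \\ V''_z \\ V''_w \end{pmatrix},
\]
\[
N = \begin{pmatrix} N_x \\ N_y \\ N_z \\ 0 \end{pmatrix},\quad
N' = \begin{pmatrix} N'_x \\ N'_y \\ N'_z \\ 0 \end{pmatrix},\quad
L = \begin{pmatrix} L_x \\ L_y \\ L_z \\ 0 \end{pmatrix}
\]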
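The benchmark equations above can be sketched in plain Python as a reference computation. This is a minimal illustration and not the benchmark's actual code: the function names (`mat_vec`, `dot`, `transform_vertex`, `intensity`) are hypothetical, and the reading of the equations (V'' = TV, V' = V''/V''_w, S_x = V'_x/V'_z, S_y = V'_y/V'_z, N' = TN, I = (L, N')/sqrt((N', N'))) is an assumption stated in the comments.

```python
import math

def mat_vec(T, v):
    """Return the product of a 4x4 matrix T (list of rows) and a 4-vector v."""
    return [sum(T[r][c] * v[c] for c in range(4)) for r in range(4)]

def dot(a, b):
    """Inner product (a, b) of two 4-vectors."""
    return sum(x * y for x, y in zip(a, b))

def transform_vertex(T, V):
    """Coordinate and perspective transformation of V = (Vx, Vy, Vz, 1).

    Assumed reading of the equations: V'' = T V, V' = V'' / V''_w,
    S_x = V'_x / V'_z, S_y = V'_y / V'_z.
    """
    V2 = mat_vec(T, V)               # V'' = T V
    V1 = [c / V2[3] for c in V2]     # V' = V'' / V''_w, so V'_w is 1
    Sx = V1[0] / V1[2]               # S_x = V'_x / V'_z
    Sy = V1[1] / V1[2]               # S_y = V'_y / V'_z
    return V1, Sx, Sy

def intensity(T, N, L):
    """Intensity I = (L, N') / sqrt((N', N')) with N' = T N."""
    N1 = mat_vec(T, N)               # N' = T N
    return dot(L, N1) / math.sqrt(dot(N1, N1))
```

For example, with a diagonal T whose entries are 2, 2, 1, 2 and V = (1, 2, 4, 1), the sketch gives V'' = (2, 4, 4, 2), V' = (1, 2, 2, 1), S_x = 0.5, and S_y = 1.0.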
The coordinate and perspective transformations require 7 FMULs, 12 FMACs,
and 2 FDIVs without special instructions (FTRV, FIPR, and FSRRA) and 1 FTRV,
5 FMULs, and 2 FSRRAs with special instructions. The intensity calculation
requires 1 FMUL, 12 FMACs, 1 FSQRT, and 1 FDIV without special instructions
and 1 FTRV, 2 FIPRs, 1 FSRRA, and 1 FMUL with special instructions.
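The instruction counts with special instructions can be reproduced by a small toy model. The sketch below is only one possible decomposition, not necessarily the one used in the benchmark. It assumes that FTRV is a 4x4 matrix-vector product, FIPR is a 4-element inner product, and FSRRA is a reciprocal square root, and that each reciprocal 1/x is obtained as FSRRA(x * x), which is valid only for x > 0. It also uses the fact that S_x = V'_x/V'_z equals V''_x/V''_z, because the common factor 1/V''_w cancels. The class `SpecialFpu` and its methods are hypothetical helpers that emulate the arithmetic exactly and tally the issued instructions; the real FSRRA is an approximation.

```python
import math
from collections import Counter

class SpecialFpu:
    """Toy model (hypothetical): emulates FTRV, FIPR, FSRRA, and FMUL
    arithmetic and counts how many times each one is issued."""

    def __init__(self):
        self.issued = Counter()

    def ftrv(self, T, v):
        """4x4 matrix times 4-vector."""
        self.issued["FTRV"] += 1
        return [sum(T[r][c] * v[c] for c in range(4)) for r in range(4)]

    def fipr(self, a, b):
        """Inner product of two 4-vectors."""
        self.issued["FIPR"] += 1
        return sum(x * y for x, y in zip(a, b))

    def fsrra(self, x):
        """Reciprocal square root (exact here, approximate in hardware)."""
        self.issued["FSRRA"] += 1
        return 1.0 / math.sqrt(x)

    def fmul(self, a, b):
        self.issued["FMUL"] += 1
        return a * b

def transform_vertex(fpu, T, V):
    """Return (S_x, S_y, V'_z), assuming V''_z > 0 and V''_w > 0."""
    V2 = fpu.ftrv(T, V)                        # V'' = T V
    rz = fpu.fsrra(fpu.fmul(V2[2], V2[2]))     # 1 / V''_z
    Sx = fpu.fmul(V2[0], rz)                   # S_x = V''_x / V''_z
    Sy = fpu.fmul(V2[1], rz)                   # S_y = V''_y / V''_z
    rw = fpu.fsrra(fpu.fmul(V2[3], V2[3]))     # 1 / V''_w
    Vz = fpu.fmul(V2[2], rw)                   # V'_z = V''_z / V''_w
    return Sx, Sy, Vz

def intensity(fpu, T, N, L):
    """Return I = (L, N') / sqrt((N', N')) with N' = T N."""
    N1 = fpu.ftrv(T, N)                        # N' = T N
    ln = fpu.fipr(L, N1)                       # (L, N')
    nn = fpu.fipr(N1, N1)                      # (N', N')
    return fpu.fmul(ln, fpu.fsrra(nn))
```

Calling `transform_vertex` on a fresh `SpecialFpu` leaves a tally of 1 FTRV, 5 FMULs, and 2 FSRRAs in `fpu.issued`, and `intensity` leaves 1 FTRV, 2 FIPRs, 1 FSRRA, and 1 FMUL, matching the counts quoted above.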
Figure 3.37 illustrates the resource-occupying cycles of the 3D graphics benchmark.
After program optimization, no register conflict occurs, and performance is
restricted only by the floating-point resource-occupying cycles. The gray areas of
the graph represent the cycles of the coordinate and perspective transformations.
Without the special instructions, the FDIV/FSQRT resources are occupied for the
most cycles, and these cycles determine the number of execution cycles, that is,
26. Using the special instructions enables some of these instructions to be replaced.
In this case, the arithmetic resource-occupying cycles determine the number of
execution cycles, that is, 11, which is 58% shorter than when special instructions are
 