Fig. 3.22 Structure of FPU MAIN block
[Block diagram. Inputs: Input-A, Input-B, Input-C. EX stage: Exp. Diff., Exp. Adder, Aligner, Multiplier Array. MA stage: Carry Propagate Adder (CPA), Leading Non-Zero Detector (LNZ), Mantissa Normalizer. WB stage: Exp. Rounder, Mantissa Rounder. Other labels: Type Check, Compare, MUX (two), Adjuster, T-bit, feedback path for Double. Outputs: VEC output, FDS output, MAIN output.]
There are two definitions of FMAC. One computes an FMUL followed by an FADD.
It conforms to the ANSI/IEEE standard of the time, but requires an extra
normalization and rounding between the multiply and the add. These extra
operations take extra time and cause inaccuracy. The other computes the exact
multiply-and-add value, then normalizes and rounds it once. This fused definition
was not in the standard at that time, but has since been added to it. The SH-4
adopted the latter, fused definition.
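The difference between the two definitions can be seen in a small numerical sketch. The code below is not the SH-4 datapath: it models values as exact fractions and uses a hypothetical helper, round_to_bits, that rounds to a 24-bit significand with round-to-nearest-even and an unbounded exponent range. The operand values are chosen for illustration only.

```python
from fractions import Fraction

def round_to_bits(x, p=24):
    """Round x to p significant bits, ties to even (exponent range unbounded)."""
    x = Fraction(x)
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    m = abs(x)
    # Estimate e so that m / 2**e lies near [2**(p-1), 2**p), then correct it.
    e = m.numerator.bit_length() - m.denominator.bit_length() - p
    scaled = m / Fraction(2) ** e
    while scaled >= 2 ** p:
        e += 1
        scaled /= 2
    while scaled < 2 ** (p - 1):
        e -= 1
        scaled *= 2
    q = round(scaled)  # round() on a Fraction rounds halves to even
    return sign * q * Fraction(2) ** e

def fmac_unfused(a, b, c):
    """FMUL then FADD: the product is rounded before the add."""
    return round_to_bits(a + round_to_bits(b * c))

def fmac_fused(a, b, c):
    """Fused: exact a + b*c, rounded once."""
    return round_to_bits(a + b * c)

# Illustrative operands, all exactly representable with 24 significant bits.
b = c = 1 + Fraction(1, 2 ** 12)
a = -(1 + Fraction(1, 2 ** 11))

unfused = fmac_unfused(a, b, c)
fused = fmac_fused(a, b, c)
print(unfused, fused)
```

With these operands the exact product is 1 + 2^-11 + 2^-24. The intermediate rounding of the unfused form drops the 2^-24 term, so the sum cancels to zero, while the fused form keeps it.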
The FMAC processing flow is as follows. At the EX stage, Exp. Diff. and Exp.
Adder calculate the exponent difference between “A” and “B*C” and the exponent
of “B*C,” respectively, and Aligner aligns “A” according to the exponent
difference. The Multiplier Array then calculates the mantissa of “A + B*C”:
“B*C” is calculated in parallel with the above operations, and the aligned “A”
is added in the final reduction logic. At the MA stage, CPA adds the Multiplier
Array outputs. In parallel with the CPA calculation, LNZ detects the leading
nonzero position of the absolute value of the CPA output directly from the
Multiplier Array outputs, and Mantissa Normalizer then normalizes the CPA output
using the LNZ output. At the WB stage, Mantissa Rounder rounds the Mantissa
Normalizer output, and Exp. Rounder normalizes and rounds the Exp. Adder output.
Both Rounders replace the rounded result with the special result if necessary to
produce the final MAIN block output.
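The stage-by-stage flow can be sketched as a behavioral model. This is a rough illustration under several assumptions, not the actual circuit: operands are hypothetical (signed integer mantissa, exponent) pairs, arbitrary-precision integers stand in for the fixed-width datapath, the LNZ is computed from the sum rather than predicted in parallel, rounding is to nearest even, and special values are ignored.

```python
from fractions import Fraction

P = 24  # assumed significand width (single precision)

def fmac(a, b, c):
    """Return a + b*c as a (mantissa, exponent) pair, rounded once to P bits."""
    (ma, ea), (mb, eb), (mc, ec) = a, b, c

    # EX stage
    exp_bc = eb + ec                 # Exp. Adder: exponent of B*C
    diff = ea - exp_bc               # Exp. Diff.: exponent difference of A and B*C
    prod = mb * mc                   # Multiplier Array: mantissa of B*C
    if diff >= 0:                    # Aligner: bring both terms to one exponent
        aligned_a, exp = ma << diff, exp_bc
    else:
        aligned_a, prod, exp = ma, prod << -diff, ea

    # MA stage
    total = aligned_a + prod         # CPA (final addition)
    if total == 0:
        return (0, 0)
    sign = -1 if total < 0 else 1
    mag = abs(total)
    lnz = mag.bit_length()           # LNZ: leading nonzero position
    shift = lnz - P                  # Mantissa Normalizer shift amount

    # WB stage
    if shift > 0:                    # Mantissa Rounder: nearest, ties to even
        q, rem = divmod(mag, 1 << shift)
        half = 1 << (shift - 1)
        if rem > half or (rem == half and q & 1):
            q += 1
        if q == 1 << P:              # rounding carried out of the top bit
            q >>= 1
            shift += 1
    else:
        q = mag << -shift            # exact result, shifted left to P bits
    return (sign * q, exp + shift)   # Exp. Rounder: adjusted exponent

def value(pair):
    """Exact value of a (mantissa, exponent) pair, for checking."""
    m, e = pair
    return m * Fraction(2) ** e

# Illustrative operands: A = -(1 + 2^-11), B = C = 1 + 2^-12
a = (-(2 ** 23 + 2 ** 12), -23)
b = c = (2 ** 23 + 2 ** 11, -23)
result = fmac(a, b, c)
print(result, value(result))
```

Because the sum is formed exactly before the single normalization and rounding, the small residual 2^-24 survives the cancellation in this example.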
Figure 3.23 illustrates the VEC block. The FTRV reads its inputs over four cycles
to calculate four transformed vector elements. The last read therefore occurs in
the fourth cycle, which is too late to cancel the FTRV even if the input value
causes an exception. The VEC block must therefore handle all data types
appropriately for the FTRV, and all denormalized numbers are detected and
adjusted differently from normalized numbers. As illustrated in Fig. 3.23, the
VEC block can start its operation at the ID stage by eliminating input operand
forwarding, so the above adjustment can be done at the ID stage.
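As a rough illustration of the two ideas above, the sketch below classifies single-precision bit patterns by their IEEE 754 fields, which is how a denormalized input can be recognized, and models the FTRV as four inner products. The function names are hypothetical, the assumption that one matrix row is consumed per cycle is ours, and the adjustment the VEC block applies to denormalized numbers is not modeled because the text does not describe it.

```python
import struct

def f32_bits(x):
    """Bit pattern of x after conversion to IEEE 754 single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def classify_f32(bits):
    """Classify a 32-bit single-precision pattern by exponent and fraction."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 0xFF:
        return "infinity" if fraction == 0 else "nan"
    return "normalized"

def ftrv(matrix_rows, vector):
    """Four inner products; one row per modeled cycle (an assumption)."""
    result = []
    for row in matrix_rows:          # modeled cycles 1 to 4
        result.append(sum(m * v for m, v in zip(row, vector)))
    return result

identity = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
transformed = ftrv(identity, [1.0, 2.0, 3.0, 4.0])
kinds = [classify_f32(f32_bits(x)) for x in (1.0, 0.0, 1e-40)]
print(transformed, kinds)
```

Since the type check only inspects the exponent and fraction fields, it needs no arithmetic, which is consistent with performing it early in the pipeline.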
 