Fig. 3.21 Pipeline structure of SH-4 FPU (diagram labels: LS pipeline with the FLS block; FE pipeline with the MAIN, FDS, and VEC blocks; stages ID, E0, EX, MA, and WB; register read, register write, and forwarding paths)
The floating-point load/store block (FLS) is the main part of the LS pipeline. At the EX
stage, it outputs the store data for an FMOV store, changes the sign for FABS and FNEG,
and outputs on-the-fly data for forwarding. At the MA stage, it receives the load data for
an FMOV load and again outputs on-the-fly data for forwarding. It writes back the result
in the middle of the WB stage, at the negative edge of the clock pulse. The written data
can then be read in the latter half of the ID stage, so no forwarding path from the WB
stage is necessary.
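The sign change that the FLS block performs for FABS and FNEG can be sketched on raw bit patterns. This is a minimal illustration, assuming the IEEE 754 single-precision layout with the sign in bit 31; the function names `fabs_bits` and `fneg_bits` are hypothetical and are not taken from the SH-4 documentation.

```python
SIGN_BIT = 0x80000000  # assumed position of the sign bit (single precision)


def fabs_bits(bits):
    """Clear the sign bit: absolute value of a 32-bit float pattern."""
    return bits & (~SIGN_BIT & 0xFFFFFFFF)


def fneg_bits(bits):
    """Flip the sign bit: negation of a 32-bit float pattern."""
    return bits ^ SIGN_BIT
```

Because only one bit is touched, neither operation needs the arithmetic datapath, which is consistent with handling them in the load/store block rather than in the FE pipeline.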
The FE pipeline consists of three blocks: MAIN, FDS, and VEC. An E0 stage is inserted
to execute the vector instructions FIPR and FTRV. The VEC block is special hardware for
these vector instructions, and the FDS block handles the floating-point divide and
square-root instructions (FDIV and FSQRT). Both blocks are explained later. The MAIN
block executes the other FE-category instructions and the postprocessing of all
FE-category instructions. It performs the arithmetic operations over two and a half
cycles spanning the EX, MA, and WB stages.
Figure 3.22 illustrates the structure of the MAIN block. It is built to execute the FMAC
instruction, which takes three operands named A, B, and C and calculates A + B × C. The
other instructions, FADD, FSUB, and FMUL, are handled by setting one of the inputs to
1.0, −1.0, or 0.0 as appropriate.
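The reuse of the FMAC datapath can be sketched as follows. This is only an illustration: the text does not say which input receives each constant, so the operand assignments below are assumptions, and the helper names are hypothetical. Python floats are double precision and the multiply-add here is not fused, so rounding and sign-of-zero behavior are not modeled.

```python
def fmac(a, b, c):
    """Model of the MAIN datapath formula A + B * C (rounding ignored)."""
    return a + b * c


def fadd(x, y):
    # Assumed mapping: C fixed to 1.0, giving x + y * 1.0
    return fmac(x, y, 1.0)


def fsub(x, y):
    # Assumed mapping: C fixed to -1.0, giving x + y * (-1.0)
    return fmac(x, y, -1.0)


def fmul(x, y):
    # Assumed mapping: A fixed to 0.0, giving 0.0 + x * y
    return fmac(0.0, x, y)
```

With this arrangement a single multiply-add structure serves all four instructions, and only the input selection differs per instruction.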
A floating-point format includes the special numbers zero, denormalized number,
infinity, and not a number (NaN), as well as normalized numbers. The inputs are
checked by the Type Check part. If any input is a special number, a proper
special-number output is generated in parallel with the normal calculation and is
selected at the Rounder parts of the WB stage instead of the calculation result.
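The classification performed by a type check of this kind can be sketched in software. The example assumes the IEEE 754 single-precision encoding (8-bit exponent, 23-bit fraction); `float_bits` and `type_check` are hypothetical helper names, and the sketch says nothing about how the hardware implements the check.

```python
import struct


def float_bits(x):
    """Return the single-precision bit pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def type_check(bits):
    """Classify a 32-bit pattern into one of the five number types."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 0xFF:
        return "infinity" if fraction == 0 else "nan"
    return "normalized"
```

For example, `type_check(float_bits(1.0))` yields `"normalized"`, while the pattern `0x7FC00000` is classified as `"nan"`. The sign bit does not affect the classification.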
The compare instructions are handled by the Compare part. The comparison is as simple
as an integer comparison, except for some special numbers. For those exceptional
cases, the input check result from the Type Check part is used and selected instead of
the simple comparison result. The final result is transferred to the EX pipeline
to set or clear the T-bit according to the result at the MA stage.
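Why the comparison is close to an integer comparison can be seen from a small model. It assumes IEEE 754 single-precision patterns, treats a NaN operand as unordered (result false, as in the IEEE 754 convention), and omits any exception signalling. All names are hypothetical, and the returned boolean merely stands in for the T-bit value.

```python
import struct


def float_bits(x):
    """Return the single-precision bit pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def is_nan(bits):
    """Special case: exponent all ones and a nonzero fraction."""
    return (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0


def sort_key(bits):
    """Map a sign-magnitude pattern to a signed int with the same ordering."""
    magnitude = bits & 0x7FFFFFFF
    return -magnitude if bits >> 31 else magnitude


def fcmp_gt(a_bits, b_bits):
    """Model of 'a > b'; the special-case check overrides the integer compare."""
    if is_nan(a_bits) or is_nan(b_bits):
        return False
    return sort_key(a_bits) > sort_key(b_bits)


def fcmp_eq(a_bits, b_bits):
    """Model of 'a == b'; +0 and -0 map to the same key and compare equal."""
    if is_nan(a_bits) or is_nan(b_bits):
        return False
    return sort_key(a_bits) == sort_key(b_bits)
```

Apart from the NaN check and the shared key for the two zeros, the result comes from an ordinary signed integer comparison of the bit patterns.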