Digital Signal Processing Reference
TABLE 3.6
Pipelining with Stalling Effects

Clock cycle   1   2   3   4   5   6   7   8   9   10  11  12
FP1, EP1      PG  PS  PW  PR  DP  DC  E1  E2  E3  E4  E5  E6
FP1, EP2                          DP  DC  E1  E2  E3  E4  E5
FP1, EP3                              DP  DC  E1  E2  E3  E4
FP2               PG  PS  PW  PR  X   X   DP  DC  E1  E2  E3
FP3                   PG  PS  PW  X   X   PR  DP  DC  E1  E2
FP4                       PG  PS  X   X   PW  PR  DP  DC  E1
FP5                           PG  X   X   PS  PW  PR  DP  DC
FP6                               X   X   PG  PS  PW  PR  DP

that instruction is in parallel with a subsequent instruction (if a 1, as shown in Figure
3.3). With a 0 in the LSB of an instruction, the chain is broken, and the subsequent
instructions are placed in the next execute packet.
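
The chaining rule above can be sketched in C. This is only an illustration: the
function name split_fetch_packet and the constant FP_SIZE are hypothetical, only
the LSB of each 32-bit word is inspected, and the sketch assumes that an execute
packet never extends past the end of its fetch packet (so the last slot always
closes the chain).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FP_SIZE 8  /* instructions per fetch packet (hypothetical name) */

/*
 * Split one fetch packet into execute packets using the LSB of each
 * instruction word. ep_len[i] receives the number of instructions in
 * execute packet i. Returns the number of execute packets found.
 */
size_t split_fetch_packet(const uint32_t fp[FP_SIZE], size_t ep_len[FP_SIZE])
{
    size_t n_ep = 0;  /* execute packets found so far */
    size_t len = 0;   /* instructions in the packet being built */
    size_t i;

    assert(fp != NULL && ep_len != NULL);

    for (i = 0; i < FP_SIZE; i++) {
        len++;
        /* LSB = 1: the next instruction runs in parallel with this one.
           LSB = 0 (or last slot, by assumption): the chain is broken. */
        if ((fp[i] & 1u) == 0 || i == FP_SIZE - 1) {
            ep_len[n_ep++] = len;
            len = 0;
        }
    }
    return n_ep;
}
```

For example, LSBs of 1, 1, 0, 1, 1, 0, 1, 0 across the eight slots give three
execute packets of 3, 3, and 2 instructions, while seven 1s followed by a 0 give
a single execute packet with all eight instructions in parallel.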
During clock cycles 1 through 4, the program fetch phases (PG, PS, PW, and PR) of
the first FP (FP1) take place. Because FP1 contains three EPs, the pipeline must
stall: EP1 enters the dispatch (DP) phase at cycle 5, EP2 at cycle 6 (not at
cycle 5), and EP3 at cycle 7. The subsequent FP (FP2), which has only one EP (with
all eight instructions in parallel), is stalled so that each of the three EPs in
FP1 can go through the DP phase. As a result, while the fetch phase for FP2 starts
at cycle 2, its DP phase does not start until cycle 8. The third FP (FP3), also
with only one EP, starts its fetch phase at cycle 3, but its DP phase does not
start until cycle 9, due to the pipeline stall.
The pipeline stalls in cycles 6 and 7, as indicated by an “X.” Once EP3
(within FP1) proceeds to its decode (DC) phase in cycle 8, the pipeline is released.
FP2 can now continue to its DP phase in cycle 8. Since FP3 through FP6 were also
stalled, each can now resume its program fetch phase in cycle 8.
Hence, with the three EPs within one FP, the pipeline stalls for two cycles. Table
3.6 illustrates the stalling pipeline effects. A pipeline stall would also take place if
the first FP had four EPs, each with two parallel instructions.
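
The timing in Table 3.6 can be reproduced with a few lines of arithmetic. The
sketch below is a simplified model, not a general pipeline simulator: it assumes
that only the first FP holds more than one EP, that every later FP holds a single
EP, and that nothing else stalls the pipeline. The function names and the
constant FETCH_PHASES are hypothetical.

```c
#include <assert.h>

#define FETCH_PHASES 4  /* PG, PS, PW, PR */

/* Stall cycles caused by a first fetch packet holding n_ep execute packets. */
int stall_cycles(int n_ep)
{
    assert(n_ep >= 1);
    return n_ep - 1;
}

/* Cycle in which the k-th EP (k = 1, 2, ...) of the first FP enters DP. */
int ep_dispatch_cycle(int k)
{
    assert(k >= 1);
    return FETCH_PHASES + k;
}

/*
 * Cycle in which FP j (j >= 2) enters DP. Without a stall, FP j would begin
 * its fetch in cycle j and reach DP in cycle j + 4; the stall caused by the
 * first FP delays it by stall_cycles(n_ep_first).
 */
int fp_dispatch_cycle(int j, int n_ep_first)
{
    assert(j >= 2);
    return j + FETCH_PHASES + stall_cycles(n_ep_first);
}
```

With three EPs in the first FP, the model gives a two-cycle stall, DP at cycles
6 and 7 for EP2 and EP3, and DP at cycles 8 and 9 for FP2 and FP3, in agreement
with the text.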
3.21 PROGRAMMING EXAMPLES USING C, ASSEMBLY, AND LINEAR ASSEMBLY
Several programming examples are discussed in this section. The first example
illustrates use of the intrinsic function _nassert to increase the efficiency of
the dot product in Example 1.3. The remaining examples illustrate both assembly
code and linear assembly code implementations: a C program calling an assembly
function, a C program calling a linear assembly function, and an assembly-coded
program calling an assembly-coded function. The focus here is on illustrating the
syntax of both assembly and linear assembly code, not necessarily on producing
optimized code.
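
As a rough idea of how _nassert is typically used, the sketch below applies it
to a generic dot product. It is not the listing from Example 1.3: the function
name dotp and its signature are assumptions made for illustration. _nassert is
an intrinsic of the TI C6000 compiler; it generates no code and only tells the
optimizer that the stated condition holds. The fallback macro is there solely so
that the sketch also builds with other compilers.

```c
#include <assert.h>

/* On compilers other than the TI C6000 compiler, make _nassert a no-op. */
#ifndef _TMS320C6X
#define _nassert(cond) ((void)0)
#endif

/* Dot product of two arrays of n 16-bit values (hypothetical signature). */
int dotp(const short *a, const short *b, int n)
{
    int sum = 0;
    int i;

    assert(n >= 0);

    /* Promise the optimizer that both arrays start on a 4-byte boundary,
       so that two 16-bit values can be loaded with one 32-bit access.
       The caller must guarantee this; the condition is not checked. */
    _nassert(((int)(a) & 0x3) == 0);
    _nassert(((int)(b) & 0x3) == 0);

    for (i = 0; i < n; i++)
        sum += a[i] * b[i];

    return sum;
}
```

If the arrays are not actually aligned as promised, the optimized code may read
the wrong data, so the alignment has to be enforced where the arrays are defined.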