Hardware Reference
In-Depth Information
[Figure 9.3: timing diagram over cycles 0 to 6 for flits H0, H1, B0, B1, ..., T0, T1, with stage labels LT-BW, RC, VA, SA, DQ, ST and markers cc and su]
Fig. 9.3 An example of the operation of a single-cycle VC-based router receiving the flits of two
packets in consecutive cycles that arrive at the same input but belong to different VCs
The index next to each flit denotes its packet and thus its input VC
(e.g., H0 is the head flit of packet 0 at input VC#0, and B1 is a body flit of
packet 1 at input VC#1).
The head flit of the first packet arrives in cycle 0, and in cycle 1 it successfully
executes all the needed tasks. In parallel, the head flit of the next packet arrives
and is stored in the input VC#1 buffer. In cycle 2, H1 performs RC and VA but fails
to allocate an output VC. In parallel, the body flit of packet 0 arrives, and it will
appear in the frontmost position of the input VC#0 buffer in the next cycle. In cycle 3,
H1 retries VA and successfully acquires an output VC, which allows it to participate
in SA. B0 also participates in SA in the same cycle. The priority of the arbiter in
SA1 points to H1 after the grant given to H0 in cycle 1. Therefore, B0 loses and
H1 is promoted to SA2. H1 is granted in SA2 as well, which allows it to move to its
destined output port after consuming the necessary credit. The losing flit B0 retries
SA in cycle 4 and wins over B1, which arrived in the meantime, following the same
procedure as before to leave the router.
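The SA1 contention described above can be sketched with a small round-robin arbiter model. This is only an illustrative sketch, not the book's implementation: the class `RoundRobinArbiter`, the function `run_example`, and the request schedule are hypothetical names introduced here, and the sketch assumes that SA2 always grants the SA1 winner because no other input competes for the same output in this example.

```python
class RoundRobinArbiter:
    """Round-robin arbiter: the pointer marks the highest-priority requester."""

    def __init__(self, n):
        self.n = n
        self.pointer = 0  # VC#0 has the highest priority initially

    def arbitrate(self, requests):
        """Return the granted index, or None if nobody requests."""
        for offset in range(self.n):
            idx = (self.pointer + offset) % self.n
            if requests[idx]:
                return idx
        return None

    def update(self, granted):
        """Give the highest priority to the requester after the granted one."""
        self.pointer = (granted + 1) % self.n


def run_example():
    sa1 = RoundRobinArbiter(n=2)  # one SA1 arbiter for the two VCs of the input
    # Flits requesting SA per cycle: (front of VC#0, front of VC#1).
    # None means that the VC does not request (empty, or still waiting for VA).
    schedule = {
        1: ("H0", None),   # H0 is alone
        3: ("B0", "H1"),   # H1 has just passed VA; B0 reached the buffer front
        4: ("B0", "B1"),   # B0 retries; B1 has arrived in the meantime
    }
    winners = {}
    for cycle, flits in sorted(schedule.items()):
        requests = [flit is not None for flit in flits]
        vc = sa1.arbitrate(requests)
        # Assumption: SA2 always grants here, so the priority is always updated.
        sa1.update(vc)
        winners[cycle] = flits[vc]
    return winners


print(run_example())
```

Under these assumptions, the grant to H0 in cycle 1 moves the priority to VC#1, so H1 beats B0 in cycle 3, and the priority then returns to VC#0, so B0 beats B1 in cycle 4.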
As long as none of the allocated output VCs is stalled, the flit flow carries on in a
similar manner without any interruption. The only cases in which an input port
is left unused are when (a) an input VC fails to succeed in VA, (b) a flit loses in SA2,
meaning that a different input utilizes the same output, or (c) the assigned output
VC is left without credits.
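The three idle cases listed above can be expressed as a simple classification. The sketch below is a hypothetical helper, not part of the original text: the `InputVC` fields and the `stall_reason` function are names assumed here for illustration, and the order of the checks (VA first, then credits, then SA2) is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputVC:
    """Hypothetical per-cycle status of the input VC holding the front flit."""
    va_done: bool       # an output VC has been allocated to this input VC
    credits: int        # credits available on the assigned output VC
    sa2_granted: bool   # the flit won the output port in SA2


def stall_reason(vc: InputVC) -> Optional[str]:
    """Return why the input port stays unused this cycle, or None if a flit leaves."""
    if not vc.va_done:
        return "(a) VA failed"
    if vc.credits == 0:
        return "(c) output VC has no credits"
    if not vc.sa2_granted:
        return "(b) lost SA2 to another input"
    return None
```

For example, `stall_reason(InputVC(va_done=True, credits=2, sa2_granted=False))` reports case (b), which matches a flit that holds an output VC and credits but loses the output port to a different input.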
9.2 The Routing Computation Pipeline Stage
The control path of any VC-based router that does not employ lookahead techniques
begins with routing computation, which computes the destined output port for each
packet. Routing computation may just pick the appropriate output port for the packet
and let it select any of the available output VCs, or it may be more restrictive and
also guide output VC selection by restricting the output VCs that the packet can
request. Pipelining RC from the remaining tasks of the router's control path follows the
same approach as in the case of wormhole routers. The first option involves the
isolation of RC only in the control path, which inevitably introduces idle cycles in the