of the same input VC buffer. Once the head flit leaves the EB, it is replaced by the
body flit in cycle 2. The same procedure is repeated for the next flits of the same
packet. In cycle 3, when the tail flit moves to the intermediate EB, the frontmost
position of the input VC buffer holds the head flit of the next packet that is free to
perform RC, while the tail flit performs SA. Therefore, once the tail flit is dequeued
from the EB, after releasing any resources, the head flit of the next packet can be
immediately engaged in VA.
Note that, if the tail flit had lost in SA, it would continue occupying
the EB. This condition should prevent the following head flit from updating the outPort
register with the new result of RC, since the old output port id stored in outPort[i]
is needed by the tail flit to retry in SA in the next cycle. Although the input VC
would remain idle for one or more cycles, this would not be a result of the pipeline
configuration but of the current output contention of the router.
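The guard on the outPort register described above can be sketched in a few lines of Python. This is only an illustrative behavioral model, not the router's actual RTL: the class `InputVC`, the flag `eb_holds_tail`, and both method names are hypothetical, and only `out_port` corresponds to a register named in the text (outPort[i]).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputVC:
    """Hypothetical per-input-VC state; only out_port mirrors a register from the text."""
    out_port: Optional[int] = None   # models the outPort[i] register
    eb_holds_tail: bool = False      # assumed flag: tail flit still waits in the EB

    def try_update_out_port(self, rc_result: int) -> bool:
        """The head flit of the next packet tries to store its RC result."""
        if self.eb_holds_tail:
            # The tail flit lost SA and will retry with the old port id,
            # so the register must not be overwritten yet.
            return False
        self.out_port = rc_result
        return True

    def tail_switch_allocation(self, granted: bool) -> None:
        """The tail flit leaves the EB only when SA grants its request."""
        if granted:
            self.eb_holds_tail = False


vc = InputVC(out_port=2, eb_holds_tail=True)
blocked = vc.try_update_out_port(3)       # refused: the tail still needs port 2
vc.tail_switch_allocation(granted=True)   # the tail wins SA and is dequeued
accepted = vc.try_update_out_port(3)      # now the new RC result is stored
```

In this model the head flit simply retries the update every cycle until the tail flit has won SA, which corresponds to the idle cycles attributed in the text to output contention rather than to the pipeline configuration.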
9.3 The VC Allocation Pipeline Stage
For the baseline router configuration, VC allocation is performed in series with RC,
and SA cannot begin before VA has produced a result. Depending on the router
configuration, the VC allocator of a single-cycle router may contribute as much as
half of the total critical-path delay, especially due to the second allocation stage
(VA2), which consists of NV:1 arbiters. Apart from the architectural modifications
presented in Chap. 8 that try to “hide” the delay of VA, or even remove it completely,
the overhead of VA can be alleviated by isolating its operation in a different pipeline
stage from SA and ST.
As can be seen from the single-cycle organization of Fig. 9.1, VA contributes to the
router's control path only for the part concerning the assembly of SA requests. A
flit is allowed to issue a request to SA even if it has just allocated an output VC
in the same cycle. This feature adds some bypass logic at the outputs of the outVC and
outVCLock registers, which allows a head flit to use the result of VA in
the same cycle. Following a similar approach to the pipelining of the RC stage, the
control path can be cut at this point simply by removing those bypass paths. The
resulting router organization after pipelining the control path at the end of the VA
stage is presented in Fig. 9.8.
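The effect of removing the bypass paths can be illustrated with a small behavioral sketch. It assumes a single head flit and ignores credits, output contention, and all other SA request conditions; the function names and the boolean `bypass` parameter are hypothetical and serve only to contrast the two organizations.

```python
def can_request_sa(out_vc_locked_reg: bool, va_granted_now: bool, bypass: bool) -> bool:
    """Return True if a flit may issue an SA request in this cycle."""
    if bypass:
        # Single-cycle organization: a fresh VA grant is forwarded
        # around the register and is usable immediately.
        return out_vc_locked_reg or va_granted_now
    # Pipelined organization: SA sees only the registered VA result.
    return out_vc_locked_reg


def cycles_until_sa_request(bypass: bool) -> int:
    """Count cycles from the VA grant (cycle 0) until the head flit may request SA."""
    locked = False
    cycle = 0
    while True:
        va_granted_now = (cycle == 0)
        if can_request_sa(locked, va_granted_now, bypass):
            return cycle
        # The register captures the grant at the end of the cycle.
        locked = locked or va_granted_now
        cycle += 1


same_cycle = cycles_until_sa_request(bypass=True)    # VA and SA request in one cycle
next_cycle = cycles_until_sa_request(bypass=False)   # SA request one cycle after VA
```

With the bypass in place the head flit requests SA in the cycle of its VA grant; without it the request is delayed by one cycle, which is exactly the pipeline boundary placed at the end of the VA stage.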
The control path of the derived two-stage pipelined router is now split into two parts.
The first part begins with RC and ends at the per-input VC registers that store
the VA result, while the second part starts with the request generation for SA and
may end at one of three possible points after passing through ST: (a) the update of
the outVCAvailable flags (SU), (b) the update of the credit counters per output VC
(CC), or (c) the output pipeline register.