Pipelined Virtual-Channel-Based Routers - Microarchitecture of Network-on-Chip Routers


of the same input VC buffer. Once the head flit leaves the EB, it is replaced by the

body flit in cycle 2. The same procedure is repeated for the next flits of the same

packet. In cycle 3, when the tail flit moves to the intermediate EB, the frontmost

position of the input VC buffer holds the head flit of the next packet that is free to

perform RC, while the tail flit performs SA. Therefore, once the tail flit is dequeued

from the EB, after releasing any resources, the head flit of the next packet can be

immediately engaged in VA.

Note that if the tail flit had lost in SA, it would continue occupying

the EB. This condition should prevent the following head flit from updating the outPort

register with the new result of RC, since the old output port id stored in outPort[i]

is needed by the tail flit to retry in SA in the next cycle. Although the input VC

would remain idle for one or more cycles, this would not be a result of the pipeline

configuration but of the current output contention of the router.
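
The guard on the outPort register described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not the book's RTL: the class name `InputVC`, its fields `eb` and `out_port`, and the `cycle` method are hypothetical, the EB is assumed to hold a single flit, and flits are represented by the strings "head", "body" and "tail".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputVC:
    """Hypothetical per-input-VC state: a one-slot EB plus the outPort register."""
    out_port: Optional[int] = None  # outPort[i]: port used by the flit in the EB
    eb: Optional[str] = None        # flit currently held in the intermediate EB

    def cycle(self, front_flit, rc_result, sa_granted):
        """Advance one clock cycle.

        front_flit: flit at the front of the input VC buffer (or None)
        rc_result:  output port computed by RC for a head flit at the front
        sa_granted: True if the flit in the EB won SA this cycle
        Returns True if front_flit moved into the EB.
        """
        eb_frees = self.eb is None or sa_granted
        if not eb_frees:
            # The flit in the EB lost SA: it stays in place, and outPort must
            # keep the old port id so that the flit can retry SA next cycle.
            return False
        # The EB is empty or is being dequeued: the next flit may enter, and
        # a head flit may now overwrite outPort with its fresh RC result.
        self.eb = front_flit
        if front_flit == "head":
            self.out_port = rc_result
        return front_flit is not None
```

For example, with a tail flit in the EB that targets port 2, calling `cycle("head", 3, sa_granted=False)` leaves `out_port` at 2 and the head flit waits; repeating the call with `sa_granted=True` moves the head flit into the EB and sets `out_port` to 3.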

9.3 The VC Allocation Pipeline Stage

For the baseline router configuration, VC allocation is performed in series with RC,

and SA cannot begin before VA has produced a result. Depending on the router

configuration, the VC allocator of a single-cycle router may contribute to the critical

path as much as half of the total delay, especially due to the second allocation stage

(VA2) that consists of NV:1 arbiters. Apart from the architectural modifications

presented in Chap. 8 that try to “hide” the delay of VA, or even remove it completely,

the overhead of VA can be alleviated by isolating its operation in a different pipeline

stage from SA and ST.

As can be seen from the single-cycle organization of Fig. 9.1, VA contributes to the

router's control path only for the part concerning the assembly of SA requests. A

flit is allowed to issue a request to SA, even if it has just allocated an output VC

in the same cycle. This feature adds some bypass logic at the outputs of the outVC and

outVCLock registers, which allows a head flit to use the result of VA in

the same cycle. Following a similar approach to the pipelining of the RC stage, the

control path can be cut off at this point, simply by removing those bypass paths. The

resulting router organization after pipelining the control path at the end of the VA

stage is presented in Fig. 9.8.
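
The effect of removing the bypass paths can be illustrated with a minimal behavioral sketch. The function names and the `simulate` helper are hypothetical and are not taken from the book; the model assumes a single head flit that is always present, ignores the outcome of SA, and treats outVCLock as a register that is updated at the end of each cycle.

```python
def sa_request_single_cycle(has_flit, out_vc_lock, va_grant):
    """Single-cycle router: the bypass lets a VA winner request SA at once."""
    return has_flit and (out_vc_lock or va_grant)


def sa_request_pipelined(has_flit, out_vc_lock):
    """VA-pipelined router: only the registered outVCLock value is used."""
    return has_flit and out_vc_lock


def simulate(pipelined, va_grant_cycle, cycles=4):
    """Per-cycle SA request of a head flit that wins VA in va_grant_cycle."""
    out_vc_lock = False  # registered value, updated at the clock edge
    trace = []
    for c in range(cycles):
        # VA grants an output VC once, in the chosen cycle.
        va_grant = (c == va_grant_cycle) and not out_vc_lock
        if pipelined:
            trace.append(sa_request_pipelined(True, out_vc_lock))
        else:
            trace.append(sa_request_single_cycle(True, out_vc_lock, va_grant))
        out_vc_lock = out_vc_lock or va_grant  # clock edge
    return trace
```

With VA granted in cycle 1, `simulate(False, 1)` yields `[False, True, True, True]`, while `simulate(True, 1)` yields `[False, False, True, True]`: without the bypass, the first SA request is issued one cycle after VA completes.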

The derived two-stage pipelined router's control path is now split in two parts.

The first part begins with RC and ends up at the per-input VC registers that store

the VA result, while the second part starts with the request generation for SA and

may end up at three possible points after passing through ST: (a) the update of

the outVCAvailable flags (SU), (b) the update of the credit counters per output VC

(CC), or (c) the output pipeline register.
