Fig. 5.4 The organization of the router that pipelines the RC stage of the control path from SA
and ST. The outPort state variable that holds the output port requests of each packet acts as the
pipeline register. (Figure not reproduced; its labels are: Input #i, Output #j, RC, SA, ST, CC, SU,
outPort, outLock, pipeline register, head, dst, en, req, granted, valid, data, credit update,
ready out[0...N-1], outAvailable[0...N-1].)
The only difference compared to the un-pipelined version lies around the outPort
register of each input. In the single-cycle organization this register is bypassed via
a multiplexer when a head flit appears at the frontmost position of the input buffer.
This bypass is necessary for allowing the head flit to generate the requests to the
SA in the same cycle. In the RC-pipelined organization this multiplexer is removed,
allowing the outPort register to play the role of the pipeline register in the control
path that separates RC from SA and ST. In both cases, the outPort register is set
(storing the output port request of the corresponding packet), when the head flit
of the packet appears at the frontmost position of the input buffer (isHead(Q) = true),
and it resets when the tail flit of the packet is dequeued from the input buffer
(isTail(Q) = true and granted).
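The set/reset rule of the outPort register and the effect of removing the bypass multiplexer can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the book's implementation: the names Flit, InputPort, route_compute, request and clock_edge are hypothetical, and routing computation is reduced to returning the destination field as the output port.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Flit:
    is_head: bool
    is_tail: bool
    dst: int  # destination field read by routing computation


def route_compute(dst: int) -> int:
    """Hypothetical RC: the destination is used directly as the output port."""
    return dst


class InputPort:
    """Behavioral model of one input's outPort register (illustrative only)."""

    def __init__(self, pipelined: bool):
        self.pipelined = pipelined
        self.out_port: Optional[int] = None  # the outPort register

    def request(self, front: Optional[Flit]) -> Optional[int]:
        """Output port request presented to SA in the current cycle."""
        if front is None:
            return None
        if not self.pipelined and front.is_head:
            # Single-cycle organization: the bypass multiplexer forwards
            # the RC result to SA in the same cycle.
            return route_compute(front.dst)
        # RC-pipelined organization (and all non-head flits):
        # SA sees only the registered value.
        return self.out_port

    def clock_edge(self, front: Optional[Flit], granted: bool) -> None:
        """Update outPort at the end of the cycle."""
        if front is None:
            return
        if front.is_tail and granted:
            self.out_port = None  # tail flit dequeued: reset
        elif front.is_head:
            self.out_port = route_compute(front.dst)  # head at front: set
```

In this model a head flit in the pipelined organization produces no request in the cycle it performs RC, and requests its output port one cycle later from the registered value. With pipelined=False the same head flit requests its output port immediately through the bypass.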
With this organization, the critical path of the router is shortened by the delay
of the RC unit. In most tested configurations it starts at the outPort register,
passes through the request generation logic and the arbitration, and ends at the
output pipeline register. Note that, because the control path is now shorter, the
critical path of the router may migrate from the control path to the data path of
the design, depending on the exact delay profile of the pipelined control path.
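The migration of the critical path can be made concrete with a toy calculation. The sketch below is a simplification: it treats the control path and the data path as two independent delays and takes the larger one as the critical path. The function name critical_path, its parameters, and the delay values in the usage note are hypothetical and chosen only for illustration; they are not measurements from the text.

```python
from typing import Tuple


def critical_path(rc_ps: int, sa_ps: int, data_ps: int,
                  rc_pipelined: bool) -> Tuple[str, int]:
    """Return which path limits the clock period and its delay in picoseconds.

    rc_ps   -- delay of the RC unit (assumed value)
    sa_ps   -- delay of request generation plus arbitration (assumed value)
    data_ps -- delay of the data path (assumed value)
    """
    # Pipelining RC removes its delay from the control path.
    control_ps = sa_ps if rc_pipelined else rc_ps + sa_ps
    if control_ps >= data_ps:
        return "control", control_ps
    return "data", data_ps
```

For example, with the made-up delays rc_ps=150, sa_ps=400 and data_ps=450, the un-pipelined control path takes 550 ps and is critical. Once RC is pipelined the control path drops to 400 ps and the 450 ps data path becomes the limit.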
The cycle-by-cycle execution of the RC control pipelined version of the router is
shown in Fig. 5.5. In cycle 0 the head flit of a packet arrives at an input and is stored
in the input buffer (BW). Then in cycle 1 the head flit performs RC and stores the
output port requests of its packet to the outPort pipeline register. In parallel a body
flit arrives at the same input. During cycle 2 the head flit performs SA and, assuming
that it is successful, it dequeues (DQ) itself from the input buffer and moves to the
crossbar that implements ST. In parallel with SA, the CC and SU operations take place,
consuming a credit and lowering the outAvailable flag. The body flit that arrived at
 