independently of the other input VC. Therefore, as B0 is granted in SA and departs
in cycle 3, H1 manages to allocate an output VC in the same cycle. Since H1 requests
any of the available output VCs in VA, its destined output port can be the same as
that of packet 0 without any conflict. In the next cycle, it may participate in SA and
effectively win a grant.
In cycle 4, T0 participates in SA but loses and remains in place. This does not
constitute a pipeline bubble, since the input is able to transmit a flit to the output
(flit H1). In cycle 5, T0 retries and now wins over packet B1 that arrived in the
meantime. Observing the input port's incoming and outgoing traffic, one would see
no idle cycles between the flits of the two packets. Idle cycles would appear if
both packets strictly requested the same output VC of the same output port. In this
case, the flits of the second packet would have to wait until T0 leaves the router and
releases the allocated output VC.
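
The difference between requesting any free output VC and strictly requesting one specific output VC can be illustrated with a minimal Python sketch. The class `OutputVCTable` and its methods are hypothetical names introduced here for illustration; they model only the ownership bookkeeping, not the router's actual allocator logic.

```python
class OutputVCTable:
    """Tracks which packet owns each (output port, output VC) pair."""

    def __init__(self, num_ports, vcs_per_port):
        self.vcs_per_port = vcs_per_port
        self.owner = {(p, v): None
                      for p in range(num_ports)
                      for v in range(vcs_per_port)}

    def allocate(self, packet, port, vc=None):
        """Try to allocate an output VC on `port` for `packet`.

        vc=None models a request for any available output VC;
        an integer models a strict request for that output VC.
        Returns the granted VC index, or None if the request must wait.
        """
        candidates = range(self.vcs_per_port) if vc is None else [vc]
        for v in candidates:
            if self.owner[(port, v)] is None:
                self.owner[(port, v)] = packet
                return v
        return None

    def release(self, port, vc):
        """Called when the tail flit leaves the router."""
        self.owner[(port, vc)] = None


# Flexible requests: both packets share output port 1 without conflict.
flexible = OutputVCTable(num_ports=2, vcs_per_port=2)
vc_p0 = flexible.allocate("packet0", port=1)
vc_p1 = flexible.allocate("packet1", port=1)

# Strict requests for the same output VC: packet 1 waits for the release.
strict = OutputVCTable(num_ports=2, vcs_per_port=2)
strict.allocate("packet0", port=1, vc=0)
blocked = strict.allocate("packet1", port=1, vc=0)   # None: must wait
strict.release(port=1, vc=0)                         # T0 has left the router
granted = strict.allocate("packet1", port=1, vc=0)   # now succeeds
```

In the flexible case the two packets obtain different output VCs of the same output port, which corresponds to the situation described above where no idle cycles are observed.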
9.3.3 Obstacles in Removing the Deficiency of the VA Pipeline Stage
The idle cycles that appear after adding the pipeline registers at the end of VA in
the control path of the router can be eliminated by also pipelining the datapath of
the router, similar to the organization presented for the pipeline of the RC stage in
Sect. 9.2.2. However, such an addition would cause dependencies across VCs that
may lead to a deadlock condition.
For example, if we followed this strategy of adding an intermediate EB in the
datapath, when a tail flit present at the intermediate EB performs SA, a head flit of
another packet behind it (at the frontmost position of the input VC buffer) would
be requesting an output VC. Assume that the tail flit owns output VC#1 and tries
to gain access to it through SA, while the head flit has successfully acquired VC#0
in VA. If the tail flit fails to move forward, either due to lack of credits or simply
because it lost in SA, the same input VC will end up owning two output VCs at the
same time. This scenario creates a dependency across VCs: Output VC#0
cannot be accessed until output VC#1 is released. Since the two output VCs may
belong to different output ports, the dependency may even affect different routers.
Avoiding such dependencies requires atomic buffer allocation, i.e., a new head flit
arrives at an input VC buffer only when the tail of the previous packet has departed
(see Sect. 3.1.2). However, complying with this requirement would lead to exactly the
same performance as in the previous case, rendering the intermediate EB in the datapath
redundant.
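
The double-ownership scenario, and the way atomic buffer allocation rules it out, can be sketched in Python as follows. This is an illustrative model only: `InputVC`, `cycle`, and their parameters are hypothetical names, VA is assumed to always grant the requested output VC, and the intermediate EB is assumed to count as part of the input VC for the purpose of atomic allocation.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class InputVC:
    """Input VC with a hypothetical intermediate EB between VA and SA."""
    # Output VC owned by a tail flit waiting in the EB (None if the EB is empty).
    eb_tail_out_vc: Optional[int] = None
    # Output VCs currently owned by this input VC.
    owned: Set[int] = field(default_factory=set)


def cycle(ivc, head_requests_vc, tail_wins_sa, atomic):
    """Advance one cycle and return the output VCs owned afterwards."""
    tail_pending = ivc.eb_tail_out_vc is not None
    # Assumption: under atomic allocation a new head flit is not present
    # in the input VC while the previous tail has not yet departed.
    head_present = head_requests_vc is not None and not (atomic and tail_pending)
    if head_present:
        ivc.owned.add(head_requests_vc)       # VA grant (assumed to succeed)
    if tail_pending and tail_wins_sa:
        ivc.owned.discard(ivc.eb_tail_out_vc)  # tail departs, releases its VC
        ivc.eb_tail_out_vc = None
    return set(ivc.owned)


# Non-atomic: the tail (owning VC#1) is stalled while the head acquires VC#0.
non_atomic = InputVC(eb_tail_out_vc=1, owned={1})
both_owned = cycle(non_atomic, head_requests_vc=0, tail_wins_sa=False, atomic=False)

# Atomic: the head cannot be present until the tail has left.
atomic_vc = InputVC(eb_tail_out_vc=1, owned={1})
only_tail = cycle(atomic_vc, head_requests_vc=0, tail_wins_sa=False, atomic=True)
```

In the non-atomic run, `both_owned` contains both output VCs, which is the cross-VC dependency described above. In the atomic run, the input VC never holds more than one output VC, but the head flit gains nothing from the extra EB.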
Thus, pipelining VA from switch allocation and traversal is performed only in
the control path, which imposes an idle cycle between (a) two packets that arrive
contiguously to the same input VC, or between (b) two packets that are heading to
the same output VC, irrespective of the input VC they belong to.
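
The two conditions can be summarized as a small predicate. The following Python sketch uses a hypothetical `Packet` record and function name; it assumes that the two packets are handled back to back and that a VC is identified by its port together with its VC index.

```python
from collections import namedtuple

# Hypothetical record: where a packet sits and where it is heading.
Packet = namedtuple("Packet", "in_port in_vc out_port out_vc")


def needs_idle_cycle(prev, nxt):
    """Return True if an idle cycle separates two consecutive packets."""
    # (a) both packets arrive contiguously at the same input VC
    same_input_vc = (prev.in_port, prev.in_vc) == (nxt.in_port, nxt.in_vc)
    # (b) both packets head to the same output VC, whatever their input VC
    same_output_vc = (prev.out_port, prev.out_vc) == (nxt.out_port, nxt.out_vc)
    return same_input_vc or same_output_vc


p0 = Packet(in_port=0, in_vc=0, out_port=1, out_vc=0)
p1 = Packet(in_port=0, in_vc=1, out_port=1, out_vc=1)  # other input VC, other output VC
p2 = Packet(in_port=0, in_vc=0, out_port=2, out_vc=0)  # same input VC as p0
p3 = Packet(in_port=3, in_vc=1, out_port=1, out_vc=0)  # same output VC as p0
```

Here `needs_idle_cycle(p0, p1)` is false, since the packets share only the output port, while `needs_idle_cycle(p0, p2)` and `needs_idle_cycle(p0, p3)` are true under conditions (a) and (b), respectively.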