In cycle 4, the body flit of the first packet is on the link, and the tail flit of the
same packet completes SA, CC and ST, releasing in parallel the output port (SU).
Ideally, the head flit of the second packet could have completed RC. However, in
the examined configuration of the RC pipeline, overlapping of RC with the tail's
SA operation is not allowed. The reason for this limitation is that the RC unit is fed
with the destination field of the head flit only when the head flit is at the frontmost
position of the input buffer. In cycle 4 the frontmost position is occupied by the tail
flit that will be dequeued at the end of the cycle and move to the output of the router.
Therefore, the head flit of the second packet can feed the RC unit with the necessary
information no earlier than cycle 5, i.e., when the tail flit is already on the link.
This bubble in the RC control pipeline appears whenever two different
packets arrive at the same input back-to-back in consecutive cycles, and it occurs
only after the end of the first packet. Packets from different inputs are not affected.
For example, when the tail flit of a packet from input i is leaving from output k, it
does not impose any idle cycle on a packet from input j that allocates output k in
the next cycle.
Therefore, RC for the head flit of the second packet is completed in cycle 5, and
the flow of flits in the pipeline continues in the same way as before in the following
cycles.
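The timing described above can be reproduced with a small behavioral sketch. It is not the router's implementation: it assumes a 3-flit first packet whose head executes RC in cycle 1, that all flits are already waiting in the input buffer, and that SA always grants. The function name `simulate_rc_pipeline` and the flit labels are hypothetical.

```python
from collections import deque

def simulate_rc_pipeline(flits, first_cycle=1):
    """Return {flit: {stage: cycle}} for one input of the baseline RC pipeline.

    Assumptions (not from the text): every flit is already buffered,
    SA grants every request, and LT follows SA-ST by one cycle.
    """
    buffer = deque(flits)       # buffer[0] is the frontmost position
    out_port_valid = False      # models the outPort control pipeline register
    schedule = {}
    cycle = first_cycle
    while buffer:
        name, kind = buffer[0]
        if kind == "head" and not out_port_valid:
            # RC is fed only by the frontmost flit; the head stays in place.
            out_port_valid = True
            schedule.setdefault(name, {})["RC"] = cycle
        else:
            # The frontmost flit completes SA-ST and is dequeued.
            entry = schedule.setdefault(name, {})
            entry["SA-ST"] = cycle
            entry["LT"] = cycle + 1
            buffer.popleft()
            if kind == "tail":
                out_port_valid = False  # tail releases the output port
        cycle += 1
    return schedule

packets = [("H1", "head"), ("B1", "body"), ("T1", "tail"),
           ("H2", "head"), ("B2", "body"), ("T2", "tail")]
baseline = simulate_rc_pipeline(packets)
print(baseline["T1"])  # {'SA-ST': 4, 'LT': 5}
print(baseline["H2"])  # {'RC': 5, 'SA-ST': 6, 'LT': 7}
```

Under these assumptions no flit performs SA-ST in cycle 5, which is the idle cycle caused by the head of the second packet waiting to reach the frontmost position.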
5.2.1 Idle-Cycle Free Operation of the RC Pipeline Stage
The bubble in the RC control pipeline is inherent to the organization of the
router: control information carried by flits that are not in the frontmost position
of the buffer cannot initiate a task, such as RC, in parallel with the tasks executed
for the flit that occupies the frontmost position of the input buffer. To eliminate
idle cycles across consecutive packets, the RC control pipeline requires the
information of both the frontmost and the second frontmost positions.
This requirement can be satisfied by adding, in parallel to the control pipeline
register (outPort), a data pipeline register that acts as a 1-slot pipelined elastic buffer
(EB) (see Sect. 2.1.3 for details). RC is initiated from the frontmost position
of the normal input buffer, while all remaining tasks, such as request generation, SA
and ST, start from the intermediate pipelined EB, thus allowing the parallel
execution of RC for the new packet and SA-ST for the tail of the old packet. This
organization is shown in Fig. 5.7.
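A behavioral sketch of this organization follows. It assumes back-to-back 3-flit packets already waiting in the input buffer, RC of the first head in cycle 1, and an SA grant in every cycle. The names `PipelinedEB` and `simulate_with_eb` are hypothetical, and the outPort register itself is not modeled; only the stage timing is tracked.

```python
from collections import deque

class PipelinedEB:
    """1-slot pipelined elastic buffer (behavioral sketch)."""
    def __init__(self):
        self.slot = None

    def can_write(self, dequeued_this_cycle):
        # Accept new data when empty, or when the stored flit leaves this cycle.
        return self.slot is None or dequeued_this_cycle

def simulate_with_eb(flits, first_cycle=1):
    """Return {flit: {stage: cycle}} with an EB between the buffer and SA-ST.

    Assumptions (not from the text): every flit is already buffered,
    SA grants every request, and LT follows SA-ST by one cycle.
    """
    buffer = deque(flits)
    eb = PipelinedEB()
    schedule = {}
    cycle = first_cycle
    while buffer or eb.slot is not None:
        # The flit held in the EB performs SA-ST and leaves the router.
        dequeued = eb.slot is not None
        if dequeued:
            name, _ = eb.slot
            schedule[name]["SA-ST"] = cycle
            schedule[name]["LT"] = cycle + 1
            eb.slot = None
        # In parallel, the frontmost buffered flit moves into the EB;
        # a head flit executes RC while it moves.
        if buffer and (dequeued or eb.can_write(False)):
            name, kind = buffer.popleft()
            entry = schedule.setdefault(name, {})
            if kind == "head":
                entry["RC"] = cycle
            eb.slot = (name, kind)
        cycle += 1
    return schedule

packets = [("H1", "head"), ("B1", "body"), ("T1", "tail"),
           ("H2", "head"), ("B2", "body"), ("T2", "tail")]
with_eb = simulate_with_eb(packets)
print(with_eb["T1"])  # {'SA-ST': 4, 'LT': 5}
print(with_eb["H2"])  # {'RC': 4, 'SA-ST': 5, 'LT': 6}
```

In this sketch the head of the second packet executes RC in cycle 4, the same cycle in which the tail of the first packet performs SA-ST from the EB, so SA-ST is busy in every cycle from 2 to 7.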
In this configuration, when a head flit appears in the frontmost position of the
input buffer, it executes RC and updates the outPort register, while moving in
parallel to the intermediate EB. The EB will only write incoming data when empty,
or when it is about to become empty in the same cycle (dequeued). Conversely,
when a tail flit moves to the intermediate EB, it resets the outPort register when
it is ready to leave the EB (i.e., it has received a grant). If, in the same cycle, the head flit of
a new packet tries to set the outPort register and move to the EB and the tail flit of