SA stage. The throughput of each solution follows directly from the behavior of
its sub-components, analyzed in Sects. 5.2 and 5.3.
When the RC stage is pipelined only in the control path, one idle cycle must
be added between the end of a packet (tail flit) and the start of the next one (head flit)
whenever the two arrive at the same input in consecutive cycles, assuming that the input
buffer is allocated non-atomically. In contrast, when the SA stage is pipelined only in
the control path, an idle cycle is inserted for the head flit: the head flit must
wait in the input buffer for one cycle, until the grant from the SA arrives.
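The two idle-cycle rules above can be turned into a small counting model. The following is only a sketch under simplifying assumptions that are not part of the text: packets arrive back-to-back at a single input, there is no contention at the outputs, and credits are always available. The function names and the configuration labels ("none", "rc_ctrl", "sa_ctrl") are hypothetical and chosen for illustration.

```python
def idle_cycles(packet_lengths, config="none"):
    """Count idle cycles for back-to-back packets at one input.

    Assumed model (illustrative only):
      "none"    : no control-path pipelining, no idle cycles
      "rc_ctrl" : RC pipelined in the control path only; one idle cycle
                  between the tail of a packet and the head of the next
      "sa_ctrl" : SA pipelined in the control path only; every head flit
                  waits one cycle for its grant
    """
    n = len(packet_lengths)
    if config == "none":
        return 0
    if config == "rc_ctrl":
        # One bubble per packet boundary: n packets have n - 1 boundaries.
        return max(n - 1, 0)
    if config == "sa_ctrl":
        # One bubble per head flit, i.e. one per packet.
        return n
    raise ValueError(f"unknown config: {config}")


def throughput(packet_lengths, config="none"):
    """Flits delivered per cycle under the assumed model."""
    flits = sum(packet_lengths)
    if flits == 0:
        return 0.0
    return flits / (flits + idle_cycles(packet_lengths, config))


if __name__ == "__main__":
    packets = [4, 4, 4]  # three 4-flit packets, arriving back-to-back
    for cfg in ("none", "rc_ctrl", "sa_ctrl"):
        print(cfg, idle_cycles(packets, cfg), throughput(packets, cfg))
```

In this model the relative throughput loss shrinks as packets get longer, since the idle cycles are paid per packet rather than per flit.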
Depending on the exact delay profile of the modules that participate in the design
of a router, such as routing computation, request masking and arbitration, grant
handling and dequeue operations, as well as credit consumption and crossbar traversal,
the presented pipelined solutions may lead to different designs in the energy-delay
space. In any case, the selection of the appropriate pipeline organization is purely
application-specific and needs scenario-specific design space exploration. In this
chapter, our goal was to present the major design alternatives in a customizable
manner, i.e., every design can be derived by combining the two primitive pipelined
organizations for the RC and SA stages that lead to reasonable configurations. Other
ad-hoc solutions that eliminate the idle cycles of the control pipeline without the
need for data pipeline stages may be possible after certain “architectural” tricks, but
their design remains outside the scope of this topic.
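The compositional idea can be illustrated by enumerating the combinations of per-stage pipeline choices. This is a minimal sketch, not the chapter's own notation: the three option labels per stage ("unpipelined", "control-only", "control+data") are an assumption inferred from the statement that registers can be placed in the control path or in the datapath, and the listing does not filter out combinations that a real design-space exploration might reject.

```python
from itertools import product

# Hypothetical labels for how a stage may be pipelined (assumption).
STAGE_OPTIONS = ("unpipelined", "control-only", "control+data")


def enumerate_configs():
    """Return every (RC, SA) combination of the assumed stage options."""
    return [{"RC": rc, "SA": sa} for rc, sa in product(STAGE_OPTIONS, repeat=2)]


if __name__ == "__main__":
    for cfg in enumerate_configs():
        print(f"RC: {cfg['RC']:<13} SA: {cfg['SA']}")
```

Each candidate would then be evaluated against the delay profile of the router's modules to place it in the energy-delay space.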
5.5 Take-Away Points
The main tasks of a wormhole router include RC, SA and ST. The router can execute
these tasks in an overlapped manner, in different pipeline stages, by following a
compositional approach in which primitive pipeline stages are stitched together
to form many meaningful pipelined configurations. Pipeline
registers can be added either in the control path or in the datapath of the router,
leading to different tradeoffs between the achieved clock frequency and the idle
cycles that appear in the flow of flits inside the router's pipeline.