Chapter 9
Pipelined Virtual-Channel-Based Routers
The overall delay of a single-cycle VC-based router is determined by the delay
of the circuits that execute the router's tasks and by the connections and
dependencies between those tasks. Although the fast allocation organizations
presented in Chap. 8 remove some of the cross-task dependencies and reduce the
delay of the router, truly fast VC-based router implementations call for
pipelined organizations. In a pipelined VC-based router, the tasks involved per
packet and per flit are executed over multiple cycles. In each cycle, however,
multiple operations proceed in parallel for several packets and flits, so the
router achieves high throughput while operating at the high clock frequency
that pipelining allows.
Pipelining the operation of a VC-based router has been the topic of both academic
and industrial research in recent years. Highly efficient pipelined organizations
have been presented, such as Hoskote et al. (2007), Howard et al. (2010), and Azimi
et al. (2009), covering both shallow and deep pipelined organizations. In each
case, however, the design presented covers only one point in the design space of
pipelined routers, and the tradeoffs of adding or removing pipeline stages are not
discussed. This chapter aims to close this gap by presenting the whole design space
of pipelined VC-based routers and the throughput-complexity implications of each
design choice.
Similar to the pipelined WH routers presented in Chap. 5, the pipelined
organization of VC-based routers will be described in a modular manner, beginning
with the timing isolation, through pipelining, of the basic steps involved in a
VC-based router, such as RC, VA, and SA. Then, following a compositional approach,
multi-stage pipelined organizations will be derived by simply combining the primitive
pipelined organizations of each stage. We believe that this customizable construction
of pipelined VC-based routers, which delivers pipelined configurations by connecting
simpler blocks in a plug-and-play manner, will help the reader better understand the
operation of complex pipelined organizations and the timing-throughput tradeoffs
involved.
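To make the idea of a design space concrete, the following minimal sketch
enumerates every way of placing pipeline registers between RC, VA, and SA and
reports the resulting clock period and per-flit latency. The delay values, the
register overhead, and all function names are hypothetical and chosen only for
illustration. The model captures timing alone and ignores any throughput loss
caused by dependencies between stages.

```python
from itertools import product

# Hypothetical combinational delays in picoseconds (illustrative values only).
TASK_DELAY_PS = {"RC": 300, "VA": 600, "SA": 500}
TASKS = ["RC", "VA", "SA"]
REGISTER_OVERHEAD_PS = 100  # assumed per-stage pipeline register cost


def pipeline_configs(tasks):
    """Yield every split of the ordered task list into contiguous stages."""
    for cuts in product([False, True], repeat=len(tasks) - 1):
        stages, current = [], [tasks[0]]
        for task, cut_before in zip(tasks[1:], cuts):
            if cut_before:
                # A pipeline register is placed before this task.
                stages.append(current)
                current = [task]
            else:
                # This task shares a cycle with the previous one.
                current.append(task)
        stages.append(current)
        yield stages


def clock_period_ps(stages):
    """The slowest stage, plus register overhead, sets the clock period."""
    slowest = max(sum(TASK_DELAY_PS[t] for t in stage) for stage in stages)
    return slowest + REGISTER_OVERHEAD_PS


def summarize(stages):
    """Return the timing figures of one pipeline configuration."""
    period = clock_period_ps(stages)
    return {
        "stages": " | ".join("+".join(stage) for stage in stages),
        "period_ps": period,
        "cycles": len(stages),
        "latency_ps": period * len(stages),
    }


if __name__ == "__main__":
    for config in pipeline_configs(TASKS):
        print(summarize(config))
```

Under these assumed numbers, the fully pipelined configuration has the shortest
clock period, but it takes more cycles and more total time per flit than the
single-cycle one. This is the kind of tradeoff that each added pipeline stage
introduces.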