Please keep in mind that the full implementation of LRC requires the addition of
RC computation units at the network interfaces as well, which prepare the output
port requests for the routers that are connected at the edges of the network.
3.7 Hierarchical Switching
The operation of the switch described in the previous paragraphs can be easily
decomposed into primitive blocks that handle arbitration and multiplexing in a
distributed manner. By using the primitive merge units described in Sect. 3.1.3
(see Fig. 3.5) and splitting the data arriving at each input port to the correct
output, one can design arbitrary distributed router architectures (Huan and
DeHon 2012; Roca et al. 2012; Balkan et al. 2009; Rahimi et al. 2011). An
example is shown in Fig. 3.19, which depicts a router with 4 inputs and 4 outputs.
Upon arrival at the input of the router, each packet performs routing computation
(RC). Subsequently, depending on buffer availability, output availability, and the
allocation steps involved in each merging unit, the flits of the packet are forwarded
to the merging unit of the appropriate output. Integration of the merging units
is straightforward, since they all operate under the same ready/valid handshake
protocol (or credit-based flow control). Every router path from input to output
passes through a pipeline of log2 N stages of merging units, N being the number of
router inputs. Moving to the next router involves one extra cycle on the link; link
traversal does not include any merging units and is just a one-to-one connection
of elastic buffers.
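The split-and-merge organization described above can be illustrated with a small cycle-level model. This is a minimal sketch under stated assumptions, not the design of any of the cited routers: the class names (MergeUnit, MergeTree, SplitMergeRouter), the "dest" field used as a stand-in for routing computation, the round-robin arbitration policy, the unbounded input queues, and the single-flit register at each merge-unit output are all hypothetical choices made for illustration. Each output owns a binary tree of 2-to-1 merge units, so a flit crosses log2 N pipeline stages inside the router.

```python
from collections import deque


class MergeUnit:
    """2-to-1 merge: round-robin arbiter plus a one-flit output register."""

    def __init__(self):
        self.out = None   # registered output flit; None means the register is empty
        self.last = 1     # index of the input granted most recently

    def step(self, candidates):
        """Accept one of two candidate flits if the output register is free.

        Returns the index of the granted input, or None if nothing moved.
        """
        if self.out is not None:          # not ready: downstream has not drained us
            return None
        for i in (1 - self.last, self.last):   # least recently granted first
            if candidates[i] is not None:
                self.out, self.last = candidates[i], i
                return i
        return None


class MergeTree:
    """Binary tree of merge units that feeds one output port from n inputs."""

    def __init__(self, n):
        assert n >= 2 and n & (n - 1) == 0, "n must be a power of two"
        self.depth = n.bit_length() - 1                 # log2(n) pipeline stages
        self.queues = [deque() for _ in range(n)]       # one queue per router input
        self.levels = [[MergeUnit() for _ in range(n >> (k + 1))]
                       for k in range(self.depth)]

    def cycle(self):
        """Advance one clock cycle; return the flit leaving the output, if any."""
        root = self.levels[-1][0]
        delivered, root.out = root.out, None
        # Update from the output side back to the inputs, so that a register
        # drained in this cycle can be refilled in the same cycle.
        for k in range(self.depth - 1, -1, -1):
            for j, unit in enumerate(self.levels[k]):
                if k == 0:
                    srcs = [self.queues[2 * j], self.queues[2 * j + 1]]
                    granted = unit.step([q[0] if q else None for q in srcs])
                    if granted is not None:
                        srcs[granted].popleft()
                else:
                    srcs = [self.levels[k - 1][2 * j], self.levels[k - 1][2 * j + 1]]
                    granted = unit.step([s.out for s in srcs])
                    if granted is not None:
                        srcs[granted].out = None
        return delivered


class SplitMergeRouter:
    """n-input, n-output router: a split stage per input, a merge tree per output."""

    def __init__(self, n):
        self.trees = [MergeTree(n) for _ in range(n)]

    def inject(self, in_port, flit):
        out_port = flit["dest"]   # stand-in for RC (assumption: dest is precomputed)
        self.trees[out_port].queues[in_port].append(flit)

    def cycle(self):
        return [tree.cycle() for tree in self.trees]


if __name__ == "__main__":
    router = SplitMergeRouter(4)
    router.inject(2, {"src": 2, "dest": 3})
    for t in range(4):
        print(t, router.cycle())
```

In this model an uncontended flit needs two cycles to cross a 4-input router, one per merge stage, and under full contention for a single output the tree still delivers one flit per cycle once the pipeline has filled.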
[Figure 3.19: block diagram of a router with inputs In#0 to In#3, each with its own RC unit, and outputs Out#0 to Out#3, each driven by a tree of merge units with arbiters (arb).]
Fig. 3.19 The parallel connection of multiple inputs to multiple outputs can be established using
a hierarchical merging tree of smaller switching elements at each output, and a split stage at the
inputs that guides incoming packets to their destined output based on the outcome of the routing
computation logic
 