Baseline Switching Modules and Routers - Microarchitecture of Network-on-Chip Routers


Equivalently, the slot counter can be moved to the output of the switching module (at the other side of the link) and act as a local output credit counter, as shown in Fig. 3.4b. The output credit counter mirrors the available buffer slots of the output. It sends a ready signal to all inputs when the number of available buffer slots at the output buffer is greater than zero. The inputs qualify their valid signals exactly the same way as in the case of the ready/valid handshake. Therefore, when a certain input is connected to the output (the output was available and the arbiter granted the particular input), it knows exactly about the availability of new credits at the output via the output credit counter.
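
The request qualification described above can be sketched as a small behavioral model. This is only an illustration, not the book's hardware: the function names `qualify_requests` and `fixed_priority_grant` are hypothetical, and the fixed-priority arbiter is an assumption chosen for brevity (the text does not prescribe an arbitration policy).

```python
def qualify_requests(valids, credit_counter):
    """Mask every input's valid signal with the output's shared ready signal."""
    # One ready signal is broadcast to all inputs: asserted while the
    # output credit counter reports at least one free buffer slot.
    ready = credit_counter > 0
    return [valid and ready for valid in valids]


def fixed_priority_grant(requests):
    """Hypothetical arbiter: grant the lowest-indexed requesting input.

    Returns the index of the granted input, or None if nobody requests.
    """
    for index, request in enumerate(requests):
        if request:
            return index
    return None


# Inputs 1 and 2 hold valid flits and the output has two credits left.
requests = qualify_requests([False, True, True], credit_counter=2)
granted = fixed_priority_grant(requests)
```

With zero credits, every qualified request is deasserted, so the arbiter grants no input and no flit leaves toward the full output buffer.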

It should be noted that the ready signal, which is asserted when creditCounter > 0, is driven only by the current state of the credit counter. The credit decrement and increment signals update only the value of the credit counter, and the new value is seen by the ready signal in the next clock cycle. Therefore, the dependency cycle formed by credit decrement → ready → request generation → arbiter's grant → credit decrement is broken after the ready signal, which also helps in isolating the timing paths starting from the request generation logic. Equivalently, each input buffer, independently of the rest, also sends its own credit update in the backward direction once it dequeues a new flit.
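
The registered behavior of the counter can be illustrated with a minimal cycle-level sketch. The class name `OutputCreditCounter` and its `tick` method are hypothetical; the model assumes that `ready` is read before the clock edge and that a decrement is only issued in a cycle where `ready` was asserted.

```python
class OutputCreditCounter:
    """Behavioral sketch of an output credit counter (names are assumptions)."""

    def __init__(self, buffer_slots):
        # After reset, every slot of the output buffer is free.
        self.credits = buffer_slots

    @property
    def ready(self):
        # Driven only by the registered counter value, never by the
        # decrement/increment signals of the current cycle.
        return self.credits > 0

    def tick(self, decrement, increment):
        """Model one clock edge: apply this cycle's credit updates."""
        self.credits += int(increment) - int(decrement)


counter = OutputCreditCounter(buffer_slots=1)
ready_before = counter.ready                    # sampled in the current cycle
counter.tick(decrement=True, increment=False)   # a flit is granted and sent
ready_after = counter.ready                     # new value visible next cycle
```

Because `ready` depends only on stored state, the grant of the current cycle cannot feed back into the same cycle's ready signal, which is how the combinational loop is avoided in this model.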

Using the output credit counter also simplifies the addition of pipeline stages on the link. For example, in Fig. 3.4c the output of the multiplexer is isolated by a simple pipeline register, i.e., outgoing data cannot stop at this point, and the readiness of the output buffer is handled via the output credit counter. As also described in the previous chapter for a single point-to-point link, even if additional pipeline stages are added between the inputs and the output, provided that the ready signal is consumed by the input without any further delay, the credit protocol guarantees maximum throughput with the least buffering requirements. In this case, the receiver needs to provide 3 buffer slots to absorb the in-flight traffic due to the increased forward and backward latency, L_f = 2 and L_b = 2.

3.1.2 Granularity of Buffer Allocation

Under the WH switching principle, each flit of a packet can move to the output provided that at least one credit is available. In contrast, VCT requires flow control to extend to the packet level by allocating buffering resources at packet granularity. In both cases, the flits of different packets are not interleaved at the output. Interleaving is enabled by virtual channels, which are presented in the following chapters.
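
The difference in allocation granularity can be expressed as a simple credit check on the head flit. This is a hedged sketch: the function `may_send_head` and the policy strings are hypothetical, and the check only captures the rule stated above (one credit per flit for WH, credits for the whole packet for VCT).

```python
def may_send_head(policy, credits, packet_len):
    """Return True if the head flit of a packet may leave toward the output.

    policy     -- "WH" (flit granularity) or "VCT" (packet granularity)
    credits    -- free buffer slots currently known at the output
    packet_len -- number of flits in the packet
    """
    if policy == "WH":
        # Each flit needs only one free slot downstream.
        return credits >= 1
    if policy == "VCT":
        # The whole packet must fit in the downstream buffer.
        return credits >= packet_len
    raise ValueError(f"unknown switching policy: {policy}")


# A 4-flit packet facing an output with 3 free slots.
wh_ok = may_send_head("WH", credits=3, packet_len=4)
vct_ok = may_send_head("VCT", credits=3, packet_len=4)
```

In this example the WH check passes while the VCT check fails, since three free slots cannot hold the whole four-flit packet.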

In packet-based flow control, which is commonly used in off-chip networks, both the channels and the buffers are allocated in units of packets, while flit-based flow control allocates both resources in units of flits. On-chip networks have often utilized flit-based flow control. The main difference between packet-level and flit-level flow control is in how the buffer resource is allocated. With packet-based flow control, before any packet moves to an output, the buffer for the entire packet needs to be allocated; thus, for a packet of L flits, an input needs to obtain L credits before
