Pipelined Wormhole Routers - Microarchitecture of Network-on-Chip Routers

parallel consumes the necessary credit and updates the state of its destined output.

The grants return in cycle 2, which causes the head flit to be dequeued from the input

buffer (DQ) and move to the crossbar. Since the head flit will leave at the end of

cycle 2, it should not produce a new request in this cycle. This is handled by the

request masking logic of Fig. 5.9 that cuts any requests generated in cycle 2 and in

effect cuts any grants delivered in cycle 3. The remaining inputs, although they have not

received their grants, also do not produce any requests in cycle 2 for the same

output; their requests are blocked since the outAvailable flag of their destined output

has been lowered in cycle 1 during SU.
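
The two blocking mechanisms described above can be sketched in a few lines of Python. This is only an illustrative model and not the logic of Fig. 5.9: the function names, and the exact way a head flit's request is qualified, are assumptions made for the example.

```python
def head_request(valid: bool, out_available: bool, has_credit: bool) -> bool:
    """Assumed qualification of a head flit's request for one output.

    The request is raised only if the flit is valid, the output is not
    locked by another packet, and the output has a free buffer slot.
    """
    return valid and out_available and has_credit


def masked_request(raw_request: bool, grant_now: bool) -> bool:
    """Request masking: a request raised in the cycle in which a grant
    returns is cut, since the flit at the front is leaving in this cycle."""
    return raw_request and not grant_now


# Cycle 2 of the example in the text.
# The winning input receives its grant, so its request is masked.
winner_request = masked_request(raw_request=True, grant_now=True)
# A losing input has no grant, but outAvailable was lowered in cycle 1.
loser_request = masked_request(
    head_request(valid=True, out_available=False, has_credit=True),
    grant_now=False,
)
print(winner_request, loser_request)  # False False
```

In this sketch neither input presents a request to the arbiter in cycle 2, so no grant can be delivered in cycle 3.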

The body flit arrives at the frontmost position of the input buffer at the end of

cycle 2. In the beginning of cycle 3, it generates its own set of requests that will be

delivered in cycle 4. The requests of the body flit survive the request masking logic

since in cycle 3 the input does not receive any grants. Since the output is locked

by the grant given to the head flit of the packet, the body flit is guaranteed to receive

a grant. The same operation continues in the following cycles, with an idle cycle

added for each flit after SA.
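
The resulting timing can be reproduced with a small calculation. The sketch below assumes that credits are always available, that flits arrive back-to-back in the input buffer, and that the head flit issues its request in cycle 1; the function name is hypothetical.

```python
from typing import List


def pipelined_sa_departures(num_flits: int, first_request_cycle: int = 1) -> List[int]:
    """Cycle in which each flit of a packet is dequeued (DQ).

    A request issued in cycle t returns as a grant in cycle t + 1, and the
    next flit can only issue its request in the cycle after its
    predecessor was dequeued.
    """
    departures = []
    request_cycle = first_request_cycle
    for _ in range(num_flits):
        grant_cycle = request_cycle + 1   # grants return one cycle after the request
        departures.append(grant_cycle)    # the flit is dequeued in the grant cycle
        request_cycle = grant_cycle + 1   # next flit requests in the following cycle
    return departures


print(pipelined_sa_departures(3))  # [2, 4, 6]
```

For a three-flit packet the flits leave in cycles 2, 4 and 6, which matches the head and body timing described above.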

5.3.2 Alternative Organization of the SA Pipeline Stage

The delay between request generation and grant delivery leads to an idle cycle

between every pair of flits of the packet. By observing that the body and tail flits

do not need to generate any request and they can move directly to ST by inheriting

the grants produced by the head flit of the same packet, we can remove all the idle

cycles experienced by the non-head flits. Before moving to ST, the body and tail flits

only need to check the availability of buffer slots at the output buffer.
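
The effect on timing can be illustrated with a short calculation. This is a sketch under the assumptions that credits are always available, that flits arrive back-to-back, and that the head flit issues its request in cycle 1; the function names are hypothetical.

```python
from typing import List


def inherited_grant_departures(num_flits: int, first_request_cycle: int = 1) -> List[int]:
    """Dequeue cycle of each flit when only the head flit waits for a grant.

    The head flit leaves one cycle after its request; every following flit
    inherits the grant and leaves in the next consecutive cycle.
    """
    head_departure = first_request_cycle + 1
    return [head_departure + i for i in range(num_flits)]


def idle_cycles_removed(num_flits: int) -> int:
    """One idle cycle is saved for every non-head flit of the packet."""
    return max(num_flits - 1, 0)


print(inherited_grant_departures(3))  # [2, 3, 4]
print(idle_cycles_removed(3))         # 2
```

With one request-to-grant round trip per flit, the same three-flit packet would leave in cycles 2, 4 and 6, so two idle cycles are saved here.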

In this configuration, the grants produced by SA should be kept constant for the whole

duration of the packet. According to the organization depicted in Fig. 5.11, the pipeline

register that was used to register the SA grants per output is now replaced by a

register that is updated under the same conditions used to update the outAvailable

flag: grants are stored when at least one request is made by a head flit, and erased when

at least one request is made by a tail flit. Now, once a head flit wins arbitration, grants persist until the

tail flit resets them. Although the head flit should wait for the grants to return, the

body and tail flits are dequeued once they have an active request (a request is always

qualified by the status of the credit counters). This condition is implemented by the

multiplexer in the backward direction. Note that the request mask used in

Fig. 5.9 is removed and the initial request generation logic is restored at the input

side.
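
The behavior of the stored grants and of the dequeue condition can be sketched as a small cycle-level model. This is a simplified illustration and not the circuit of Fig. 5.11: it models a single input (index 0) competing for one output, the arbiter is reduced to "the only requester wins", and all class, function and parameter names are assumptions.

```python
from typing import Callable, List, Optional, Tuple


class GrantHold:
    """Per-output register that keeps the grant for the whole packet."""

    def __init__(self) -> None:
        self.owner: Optional[int] = None  # input that holds the output, if any

    def clock(self, head_winner: Optional[int], tail_request: bool) -> None:
        if tail_request:                  # a tail flit's request erases the grant
            self.owner = None
        elif head_winner is not None and self.owner is None:
            self.owner = head_winner      # a winning head flit's grant is stored


def simulate(packet: str,
             credits_ok: Callable[[int], bool] = lambda cycle: True,
             max_cycles: int = 100) -> Tuple[List[Tuple[int, str]], Optional[int]]:
    """Return the (cycle, flit) dequeue log and the final register state.

    packet is a string such as "HBT" (head, body, tail).
    """
    reg = GrantHold()
    queue = list(packet)
    log = []
    for cycle in range(1, max_cycles + 1):
        if not queue:
            break
        flit = queue[0]
        request = credits_ok(cycle)       # requests are qualified by the credits
        if flit == "H":
            dequeue = reg.owner == 0      # the head flit waits for the stored grant
            winner = 0 if (request and reg.owner is None) else None
            reg.clock(winner, False)
        else:
            dequeue = request             # body and tail flits leave on their own request
            reg.clock(None, dequeue and flit == "T")
        if dequeue:
            log.append((cycle, queue.pop(0)))
    return log, reg.owner


print(simulate("HBT"))  # ([(2, 'H'), (3, 'B'), (4, 'T')], None)
```

In this model the head flit leaves in cycle 2 and the body and tail flits follow in consecutive cycles. If credits are unavailable in some cycle, for example `simulate("HBT", lambda c: c != 3)`, the body flit simply waits one cycle while the stored grant remains in place.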

The stored grants always drive the select lines of the output multiplexer, transferring

data and their valid signals to the output register. However, we should guarantee

that the valid signal seen at the output buffer always corresponds to a legal flit; a

flit is legal if it is both valid and the output buffer has enough credits to accept it.

Delivering the valid signal of the input buffer to the output multiplexer, as done in

the previous cases of Figs. 5.2, 5.4, 5.7 and 5.9, is not enough in this configuration, since
