High-Speed Allocators for VC-Based Routers - Microarchitecture of Network-on-Chip Routers

head flit has no choice but to wait for the selected output VC to become available. Even if other eligible output VCs are available, the head flit cannot change the output VC request decided during LVA1. Instead of performing the lookahead VA1 step in the previous router, VA1 can also be performed at the end of link traversal. In this way, the head flit decides on an output VC request at the time it is written to the input VC buffer. Done this way, it is easier for the head flit to know the status of the output VCs, since it has already reached the next router.

Inevitably, the lack of information during lookahead VA1 about which output VCs are available will cause several input VCs to pre-select and fight for the same output VC, even if other output VCs are available. Depending on the application and the number of VCs in the network, this feature may limit the throughput of the network. From our measurements, only slight reductions are expected, which can possibly be offset by the increased operating speed offered by lookahead VA1.
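The conflict described above can be illustrated with a toy count. This is only a sketch under simplifying assumptions that are not part of the text: all head flits target the same output port, every output VC is eligible for every packet, and the function names are hypothetical.

```python
def grants_with_blind_preselection(requests, free_vcs):
    """requests[i] is the output VC pre-selected by head flit i.
    A free output VC can be granted to only one requester, so the number
    of matches is the number of distinct requested VCs that are free."""
    return len(set(requests) & set(free_vcs))


def grants_with_full_status(requests, free_vcs):
    """Upper bound if every head flit could be steered to any free VC."""
    return min(len(requests), len(free_vcs))


requests = [0, 0, 2]        # two head flits pre-selected output VC 0
free_vcs = {0, 1, 2, 3}     # all four output VCs happen to be free
print(grants_with_blind_preselection(requests, free_vcs))  # 2
print(grants_with_full_status(requests, free_vcs))         # 3
```

One head flit stays blocked although output VCs 1 and 3 are idle, which is the kind of throughput loss the paragraph refers to.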

8.3 VC Allocation Without VA2: Combined Allocation

By either not letting a packet change VC while it traverses the network or performing VA1 in a lookahead manner, we managed to remove VA1 from the critical path of VC allocation. In this case, the allocation steps that must be executed in series are VA2, which matches an output VC to a certain input VC, followed by the two steps of SA, which match an input VC to an output port on a cycle-by-cycle basis. Even in this reduced-complexity allocation organization, we assume that all requesting input VCs can be allocated simultaneously to available output VCs, provided that no other input VC is asking for the same output VC. However, we know that, due to SA1, only one input VC per input will be allowed to leave the router. This is a structural requirement imposed by the datapath of the router (the input VCs of the same input share one input of the crossbar). Therefore, there is no reason to let more than one VC per input get matched to an output VC; in the end, at most one VC per input will be allowed to leave the router.
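The one-VC-per-input restriction of SA1 can be modeled in software as a per-input arbiter. The sketch below assumes a round-robin policy, which the text does not prescribe; the class and method names are hypothetical.

```python
class RoundRobinArbiter:
    """Grants at most one of n requesters per call (assumed policy)."""

    def __init__(self, n):
        self.n = n
        self.next = 0  # requester with the highest priority in the next call

    def grant(self, requests):
        """requests is a list of n booleans, one per input VC.
        Returns the index of the granted VC, or None if nobody requests."""
        for k in range(self.n):
            idx = (self.next + k) % self.n
            if requests[idx]:
                # The winner gets the lowest priority next time.
                self.next = (idx + 1) % self.n
                return idx
        return None


sa1 = RoundRobinArbiter(4)            # one arbiter per input port, 4 VCs
reqs = [True, False, True, False]     # VCs 0 and 2 have flits to send
print(sa1.grant(reqs), sa1.grant(reqs), sa1.grant(reqs))  # 0 2 0
```

Whatever the policy, only the single granted VC reaches the crossbar input in a given cycle, so matching additional VCs of the same input to output VCs in that cycle brings no benefit.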

The restriction that at most one new VC per input is allowed to match to a new VC per output can be applied by allocating an output VC only to the input VC that won in SA. In this way, the allocation of an output port in SA is accompanied by the allocation of an output VC. This combined allocation completely eliminates the VA2 stage of VC allocation (Lu et al. 2012). VA1 is still needed so that every input VC knows beforehand which output VC to request when it wins in SA.
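A minimal behavioral sketch of one combined allocation cycle is given below. It is not the organization of Lu et al. (2012) in detail: the fixed-priority arbitration, the data layout, and all names (`InputVC`, `combined_allocation`, `vc_free`, `credits`) are assumptions made for illustration. Requests are qualified by output VC availability and credits before entering SA.

```python
from dataclasses import dataclass


@dataclass
class InputVC:
    out_port: int              # output port chosen by routing computation
    out_vc: int                # output VC pre-selected by VA1
    has_flit: bool = True      # a flit waits at the head of the buffer
    owns_out_vc: bool = False  # True once the packet holds its output VC


def combined_allocation(inputs, vc_free, credits):
    """One allocation cycle. inputs[i][v] is an InputVC, vc_free[p][c] is a
    bool and credits[p][c] an int. Returns {out_port: (input, input_vc)}."""
    # SA1: each input port forwards at most one eligible VC.
    sa1 = {}
    for i, vcs in enumerate(inputs):
        for v, vc in enumerate(vcs):
            p, c = vc.out_port, vc.out_vc
            vc_ok = vc.owns_out_vc or vc_free[p][c]
            if vc.has_flit and vc_ok and credits[p][c] > 0:
                sa1[i] = v  # assumed fixed priority: first eligible VC wins
                break
    # SA2: each output port grants at most one input (lowest index first).
    grants = {}
    for i, v in sorted(sa1.items()):
        p = inputs[i][v].out_port
        if p not in grants:
            grants[p] = (i, v)
    # Only SA winners take ownership of the output VC they requested.
    for p, (i, v) in grants.items():
        vc = inputs[i][v]
        if not vc.owns_out_vc:
            vc_free[p][vc.out_vc] = False
            vc.owns_out_vc = True
        credits[p][vc.out_vc] -= 1  # one buffer slot consumed downstream
    return grants


inputs = [[InputVC(0, 0), InputVC(1, 0)], [InputVC(0, 0)]]
vc_free = [[True, True], [True, True]]
credits = [[2, 2], [2, 2]]
print(combined_allocation(inputs, vc_free, credits))  # {0: (0, 0)}
```

In this example both inputs request output VC 0 of port 0. Input 0 wins SA and obtains the output VC in the same step, while the head flit at input 1 stays unallocated and must retry once the output VC is released.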

From the previous discussion we know that the VA1 step can be performed either in series with SA or using lookahead VA1. Since the selected output VC will be used directly for driving SA, it should be checked both for availability and for available credits. Credit masking can be performed prior to VC availability checking
