head flit has no choice but to wait for the selected output VC
to become available. Even if other eligible output VCs are available, the head flit
cannot change the output VC request it decided on during LVA1. Instead of performing
the lookahead VA1 step in the previous router, VA1 can also be performed at the
end of link traversal. In this way, the head flit decides on an output VC request at
the time it is written to the input VC buffer. Done this way, it is easier for
the head flit to know the status of the output VCs, since it has already reached the next
router.
Inevitably, the lack of information about which output VCs are available during
lookahead VA1 will cause several input VCs to pre-select and compete for the same
output VC, even if other output VCs are available for use. Depending
on the application and the number of VCs in the network, this behavior may limit
the throughput of the network. Our measurements suggest that only slight reductions are
to be expected, and these may be offset by the higher operating speed that
lookahead VA1 offers.
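The contention described above can be illustrated with a minimal Python sketch. It is not taken from the text: the function names, the modulo mapping used for the blind pre-selection, and the round-robin spreading used for the informed selection are all assumptions chosen only to make the effect visible.

```python
from collections import Counter

def blind_preselect(requesters, num_out_vcs):
    # Lookahead-style choice with no view of output VC availability.
    # Assumption: each input VC maps its id onto an output VC by modulo.
    return {r: r % num_out_vcs for r in requesters}

def informed_select(requesters, free_out_vcs):
    # Choice made once availability is known: spread the requesters
    # over the free output VCs (assumed policy, for illustration only).
    free = sorted(free_out_vcs)
    return {r: free[i % len(free)] for i, r in enumerate(sorted(requesters))}

def conflicts(choice):
    # Number of input VCs that must wait because another input VC
    # selected the same output VC.
    counts = Counter(choice.values())
    return sum(n - 1 for n in counts.values())

# Three input VCs target the same output port, which has four free VCs.
requesters = [0, 4, 8]
blind = blind_preselect(requesters, num_out_vcs=4)
informed = informed_select(requesters, free_out_vcs={0, 1, 2, 3})
blind_conflicts = conflicts(blind)
informed_conflicts = conflicts(informed)
```

In this contrived case all three blind requests land on output VC 0, so two of them must wait although three other output VCs are free, while the informed selection produces no conflict.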
8.3 VC Allocation Without VA2: Combined Allocation
By either not letting a packet change VC while it traverses the network or
performing VA1 in a lookahead manner, we managed to remove VA1 from the
critical path of VC allocation. In this case, the allocation steps that must
be executed in series are VA2, which matches an output VC to a certain input VC,
and then the two steps of SA, which match an input VC to an output port on a
cycle-by-cycle basis. Even in this reduced-complexity allocation organization, we assume
that all requesting input VCs can be allocated simultaneously to available output
VCs, provided that no other input VC is asking for the same output VC. However,
we know that, due to SA1, only one input VC will be allowed to leave the router from
each input. This is a structural requirement imposed by the datapath of the router
(the input VCs of the same input share one input of the crossbar). Therefore, there is
no reason to let more than one VC per input get matched to an output VC;
in the end, at most one VC per input will be allowed to leave the router.
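The structural constraint behind this argument can be shown in a few lines of Python. This is a sketch only: the function name `sa1_pick` is hypothetical, and the fixed-priority policy (lowest VC index wins) is an assumption made to keep the example deterministic, not the arbitration policy described in the text.

```python
def sa1_pick(requests_per_input):
    # Local switch allocation: every input owns a single crossbar input,
    # so at most one of its requesting VCs can be selected per cycle.
    # Assumption: fixed priority, lowest VC index wins.
    return {inp: min(vcs) for inp, vcs in requests_per_input.items() if vcs}

# Input 0 has two requesting VCs, input 1 has none, input 2 has one.
requests_per_input = {0: [1, 3], 1: [], 2: [0]}
sa1_winners = sa1_pick(requests_per_input)
```

Even if VC 3 of input 0 had been matched to an output VC in the same cycle, it could not leave the router, because VC 1 already occupies that input's crossbar port.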
The restriction that at most one new VC per input is allowed to match to a new
VC per output can be enforced by allocating an output VC only to the input VC that
won in SA. In this way, the allocation of an output port in SA is accompanied by
the allocation of an output VC. This combined allocation completely eliminates the
VA2 stage of VC allocation (Lu et al. 2012). VA1 is still needed so that every
input VC knows beforehand which output VC to request when it wins in SA.
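A minimal Python sketch of combined allocation follows. It is an illustration under stated assumptions, not the implementation of Lu et al.: the tuple layout, the function name `combined_allocation`, and the fixed-priority arbiters are hypothetical, and only head flits are modeled. Each request carries the output VC chosen earlier by VA1 and is qualified by the availability and the credits of that output VC before it enters SA.

```python
def combined_allocation(sa_requests, out_vc_free, credits):
    # sa_requests: (input_port, input_vc, out_port, out_vc) tuples for head
    # flits, where out_vc was selected beforehand by VA1.
    # out_vc_free and credits are dicts keyed by (out_port, out_vc).
    eligible = [r for r in sa_requests
                if out_vc_free[(r[2], r[3])] and credits[(r[2], r[3])] > 0]

    # SA1: one VC per input (assumption: lowest VC index wins).
    sa1 = {}
    for r in sorted(eligible, key=lambda r: (r[0], r[1])):
        sa1.setdefault(r[0], r)

    # SA2: one input per output port (assumption: lowest input index wins).
    sa2 = {}
    for r in sorted(sa1.values(), key=lambda r: r[0]):
        sa2.setdefault(r[2], r)

    # The SA winner takes its pre-selected output VC; no VA2 step is needed.
    for _, _, out_port, out_vc in sa2.values():
        out_vc_free[(out_port, out_vc)] = False
    return sorted(sa2.values())

out_vc_free = {(1, 0): True, (1, 1): True, (3, 0): True}
credits = {(1, 0): 2, (1, 1): 2, (3, 0): 2}
sa_requests = [(0, 0, 1, 0), (0, 1, 1, 1), (1, 0, 1, 0), (2, 0, 3, 0)]
winners = combined_allocation(sa_requests, out_vc_free, credits)
```

Here inputs 0 and 1 both ask for output port 1. Input 0 wins SA and receives output VC 0 of that port in the same step, while output VC 1 stays free because the VC of input 0 that requested it lost in SA1.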
From the previous discussion we know that the VA1 step can be performed either
in series with SA, or using lookahead VA1. Since the selected output VC will be
used directly for driving SA, it should be checked both for availability and for
available credits. Credit masking can be performed prior to VC availability checking