the maximum allowed value for each credit counter may change dynamically. This feature considerably complicates the update of the credit counters (Nicopoulos et al. 2006).
Instead, we present a different approach that keeps the depth of each credit
counter constant and greatly simplifies the handling of the credits of each VC
in a shared buffer configuration. The sender keeps one credit counter for each
downstream VC, which refers to its private buffer space, plus one counter for the
shared buffer that counts the available buffer slots in the shared region. A VC is
eligible to send a new flit when there is at least one free position in either the
private or the shared buffer (creditCounter[i] > 0 or creditShared > 0). Once a flit
is sent from the i-th VC, the credit counter of the i-th VC is decremented. If the
credit counter of the i-th VC was already equal to or smaller than zero, the flit
consumed a free slot of the shared buffer, so the counter of the shared buffer is also
decremented.
Since the state of each VC is kept at the sender, the receiver only needs to send
backwards a credit-update signal carrying a VC ID, which indexes the VC that
has one more available credit for the next cycle. On a credit update that refers to
the j-th VC, the corresponding credit counter is increased. If the credit counter was
negative before the update (i.e., it is still at most zero after the increment), the freed
slot belonged to the shared buffer, so the credit counter of the shared buffer is also
increased. Note that, even though there is a separate credit counter for the shared
buffer, the forward valid signals and the credit updates refer only to the VCs of the
channel; no separate flow control wiring is needed in the channel to implement a
shared buffer at the receiver.
In this case, safe operation is guaranteed even if there is only 1 empty slot per
VC. In the case of single-cycle links (Lf = 1, Lb = 1), each VC can utilize up to 2
buffer slots before it stops, and those positions are enough to enable safe and full-
throughput operation per VC. Therefore, when each VC can utilize at least 2 buffer
slots of either private or shared buffer space, it does not experience any throughput
limitation. If a certain VC sees only 1 available buffer slot, it must inevitably limit
its throughput to 50 %, even if it is the only active VC on the link.
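The 50 % figure follows from the credit round-trip time: a credit returns Lf + Lb cycles after the flit that consumed it was sent, so a VC holding `slots` credits can sustain at most slots / (Lf + Lb) flits per cycle. A minimal sketch of this bound, assuming the round trip is exactly Lf + Lb (real routers may add pipeline stages to the turnaround):

```python
def max_throughput(slots, lf=1, lb=1):
    """Upper bound on sustained flits/cycle for a VC with `slots` credits,
    forward link latency `lf` and backward credit latency `lb`."""
    return min(1.0, slots / (lf + lb))

print(max_throughput(2))  # 1.0 -> full rate with 2 slots
print(max_throughput(1))  # 0.5 -> 50 % with a single slot
```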
The generic shared buffer architecture, which includes a private buffer space per
VC and a shared buffer space across VCs, can be designed in a modular and
extensible manner if we follow certain design rules (operational principles). First,
any allocation decision about which VC should dequeue a flit from the buffer
is taken based only on the status of the private VC buffers; the private buffers act
as parallel FIFOs, each one presenting just one frontmost flit per VC to the
allocation logic. Second, when the private buffer of a VC drains one flit and empties
one position, the free slot is refilled in the same cycle, either with a flit possibly
present in the shared buffer, or directly from the input, provided the new flit belongs
to the same VC. Whenever the private buffer of a VC cannot accommodate an
incoming flit, a shared slot is allocated to store it. As soon as private space becomes
available again, the flit is retrieved from the shared buffer and moved to the
corresponding private buffer.
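Both rules can be captured in a small behavioral model (not RTL; class and method names are illustrative assumptions): the allocation logic only ever sees the private FIFOs, and a dequeue immediately refills the freed private slot with the oldest same-VC flit waiting in the shared region.

```python
from collections import deque

PRIVATE_DEPTH = 1   # assumed private slots per VC

class SharedBufferRx:
    """Behavioral sketch of a receiver with private-per-VC + shared buffers."""
    def __init__(self, num_vcs, shared_slots):
        self.private = [deque() for _ in range(num_vcs)]
        self.shared = deque()          # (vc, flit) pairs, in arrival order
        self.shared_slots = shared_slots

    def enqueue(self, vc, flit):
        if len(self.private[vc]) < PRIVATE_DEPTH:
            self.private[vc].append(flit)
        else:
            # private buffer full: allocate a shared slot
            assert len(self.shared) < self.shared_slots
            self.shared.append((vc, flit))

    def dequeue(self, vc):
        flit = self.private[vc].popleft()
        # refill the freed private slot with the oldest same-VC flit
        # waiting in the shared buffer, if any
        for i, (v, f) in enumerate(self.shared):
            if v == vc:
                del self.shared[i]
                self.private[vc].append(f)
                break
        return flit
```

In hardware the refill happens combinationally within the same cycle; the sequential model above only illustrates the ordering guarantee, i.e., that the allocator always sees the frontmost flit of each VC in its private FIFO.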
Every time a VC dequeues a flit from its private buffer, it should check the shared
buffer for another flit that belongs to the same VC. Figure 6.8 demonstrates the