from the head position until a NULL pointer is found. However, this would add
significant latency, which is unnecessary when the last flit position can be accessed
directly, with almost no overhead.
6.3.2 Primitive Shared Buffer for VCs: ElastiStore
Buffer sharing can be pushed to the limit to design low-cost VC buffers that offer
the minimum possible buffering and still allow a single VC to enjoy 100 %
throughput of data transfer. This buffering architecture, called ElastiStore, utilizes
only V + 1 buffers for V VCs (Seitanidis et al. 2014a). Each VC owns a single
buffer, which is enough in the case of uniform utilization, where each of the M
active VCs receives a throughput of 1/M, with 2 ≤ M ≤ V. Furthermore, when a
single VC uses the channel without any other VC being active, i.e., M = 1, it
receives 100 % throughput, and, in the case of a stall, it may use the additional
buffer available in ElastiStore. This additional buffer is shared dynamically by all
VCs, although only one VC can hold it in each clock cycle. However, when all VCs
except one are blocked, and the shared buffer is occupied by a blocked VC, the only
active VC will get 50 % of the throughput, since it effectively sees only one buffer
available per channel. Note that the baseline VC-based buffer of Fig. 6.5, which
allocates 2 buffers to each VC, would allow this active VC to enjoy full channel
utilization.
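The 50 % versus 100 % figures above can be checked with a small occupancy model. The sketch below is illustrative only and is not taken from the ElastiStore design. It assumes that the ready signal seen by the sender is derived from the buffer occupancy at the start of each cycle, with no same-cycle bypass, that the sender always has a flit to send, and that the downstream side is never stalled. The function name channel_throughput and its parameters are hypothetical.

```python
def channel_throughput(visible_slots, num_cycles=1000):
    """Fraction of cycles in which the single active VC delivers a flit.

    visible_slots is the number of buffer slots the VC can use:
    1 when the shared slot is held by a blocked VC,
    2 when the VC has its private slot plus the shared one.
    Assumption: decisions use only start-of-cycle occupancy.
    """
    occupancy = 0
    delivered = 0
    for _ in range(num_cycles):
        accept = occupancy < visible_slots  # sender sees "ready"
        drain = occupancy > 0               # receiver forwards one flit
        occupancy += int(accept) - int(drain)
        delivered += int(drain)
    return delivered / num_cycles


if __name__ == "__main__":
    print("one visible slot :", channel_throughput(1))
    print("two visible slots:", channel_throughput(2))
```

Under these assumptions, a VC that sees a single slot alternates between filling and draining it, so it delivers a flit every other cycle. With two visible slots, a new flit can be accepted while the previous one drains, and the channel stays busy after the first cycle.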
Figure 6.11 illustrates an example of flit flow between two ElastiStores, each of
which supports 2 VCs. In the first cycles, each VC receives 50 % of the channel
throughput (M = 2), and, at each step, each VC utilizes only one buffer slot. In
those cycles, the shared registers of the two ElastiStores are not utilized. The
shared buffers are used between cycles 4 and 7 to accommodate the stalled words of
VC B. In those cycles, VC A, which is not blocked, continues to deliver its words to
the output of the channel.
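The behaviour described for Fig. 6.11 can be approximated with a cycle-level sketch of a single ElastiStore that serves 2 VCs. This is a simplified model under stated assumptions and does not reproduce the figure cycle by cycle: it models one buffer rather than two connected ones, the sender always has flits for both VCs, both the sender and the output pick VCs in round-robin order, and all decisions use start-of-cycle state. The function simulate_elastistore and its arguments are hypothetical names introduced for illustration.

```python
def simulate_elastistore(num_cycles, stall_cycles, stalled_vc=1, num_vcs=2):
    """Return a per-cycle trace of (cycle, sent_vc, delivered_vc, shared_owner).

    Each VC has one private slot; one extra slot is shared by all VCs.
    VC index 0 plays the role of VC A and index 1 the role of VC B.
    """
    private = [0] * num_vcs  # 1 if the VC's private slot holds a flit
    shared_owner = None      # VC whose flit occupies the shared slot
    in_ptr = 0               # round-robin pointer of the sender
    out_ptr = 0              # round-robin pointer of the output
    trace = []
    for cycle in range(num_cycles):
        # Sender: first VC (round-robin) that sees a free slot.
        send = None
        for k in range(num_vcs):
            vc = (in_ptr + k) % num_vcs
            if private[vc] == 0 or shared_owner is None:
                send = vc
                break
        # Output: first VC (round-robin) holding a flit and not stalled.
        out = None
        for k in range(num_vcs):
            vc = (out_ptr + k) % num_vcs
            blocked = vc == stalled_vc and cycle in stall_cycles
            if private[vc] == 1 and not blocked:
                out = vc
                break
        # Apply the dequeue first, then the enqueue.
        if out is not None:
            if shared_owner == out:
                shared_owner = None  # shared flit moves to the private slot
            else:
                private[out] = 0
            out_ptr = (out + 1) % num_vcs
        if send is not None:
            if private[send] == 0:
                private[send] = 1
            else:
                shared_owner = send  # private slot full: take the shared slot
            in_ptr = (send + 1) % num_vcs
        trace.append((cycle, send, out, shared_owner))
    return trace


if __name__ == "__main__":
    for row in simulate_elastistore(9, stall_cycles=range(4, 8)):
        print(row)
```

In this model the shared slot stays empty while both VCs alternate on the channel. Once VC B is stalled, its second flit moves into the shared slot, and VC A keeps delivering flits using only its private slot.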
[Figure 6.11: timing diagram over cycles 0 to 7 showing the flits of VC A and VC B at the Input, EStore#0, Channel, EStore#1, and Output stages; VC B stalls and is later released.]
Fig. 6.11 An example of the flow of flits on a channel that supports 2 VCs and utilizes
ElastiStores at both ends of the link
 