Hardware Reference
In-Depth Information
[Figure 6.6: timing diagram of flits from VCs A and B over cycles 0 to 8, with rows Input, VCbuf#0, Link, VCbuf#1, and Output; annotations mark where VC B stalls and where VC B is released.]
Fig. 6.6 Flit flow on a channel between two primitive VC buffers that employ a 2-slot EB for each VC
On the read side of the VC buffer, an arbitration mechanism selects only
one of the valid VCs that is also ready downstream to leave the buffer. A packet
that belongs to the i-th VC can be hosted either by the same VC in the next buffer
or by a different VC, provided that it has won exclusive access to that VC.
VC-based flow control does not impose any rules on how the VCs should be assigned
between a sender buffer and a receiver buffer. Allowing packets to change VC
in-flight can be employed when the routing algorithm does not impose any VC
restrictions (e.g., XY routing does not even require the presence of VCs). However,
if the routing algorithm and/or the upper-layer protocol (e.g., cache coherence)
place specific restrictions on the use of VCs, then arbitrary in-flight VC changes are
prohibited, because they may lead to deadlocks. In the presence of VC restrictions,
the allocator/arbiter should enforce all rules during VC allocation to ensure deadlock
freedom. The VC selection policies used inside the routers will be thoroughly
discussed in the next chapter.
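The read-side selection and the VC assignment described above can be sketched as follows. This is a minimal illustration, not the scheme used by any particular router: the round-robin order, the preference for keeping the same VC, and the names `select_vc`, `allocate_next_vc`, `busy`, and `allowed` are all assumptions introduced here.

```python
def select_vc(valid, ready, last_grant):
    """Pick one VC that holds a flit (valid) and whose downstream VC is ready.

    Round-robin starting after the last granted VC (an assumed policy).
    Returns the VC index, or None if no VC can leave the buffer this cycle.
    """
    n = len(valid)
    for offset in range(1, n + 1):
        vc = (last_grant + offset) % n
        if valid[vc] and ready[vc]:
            return vc
    return None


def allocate_next_vc(in_vc, busy, allowed=None):
    """Choose a VC in the next buffer for a packet arriving on in_vc.

    busy[vc] is True when another packet already owns that VC exclusively.
    allowed maps each input VC to the downstream VCs it may use; None means
    the routing algorithm and protocol impose no VC restrictions.
    """
    candidates = range(len(busy)) if allowed is None else allowed[in_vc]
    # Try the same VC first (assumed preference), then any other permitted VC.
    for vc in sorted(candidates, key=lambda vc: vc != in_vc):
        if not busy[vc]:
            return vc
    return None


if __name__ == "__main__":
    # VC 1 holds a flit but is not ready downstream, so VC 0 is selected.
    print(select_vc(valid=[True, True], ready=[True, False], last_grant=0))
    # Without restrictions, a packet on VC 0 may move to the free VC 1.
    print(allocate_next_vc(0, busy=[True, False]))
    # With a restriction that VC 0 must stay on VC 0, it has to wait.
    print(allocate_next_vc(0, busy=[True, False], allowed={0: [0], 1: [1]}))
```

Passing an `allowed` map is one way to model an allocator that enforces VC restrictions at allocation time, so that a prohibited in-flight VC change is never granted.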
Figure 6.6 depicts a running example of a VC-based pipeline using a 2-slot EB
per VC. The two active VCs each receive a throughput of 50% and each VC uses
only one buffer slot out of the two available per VC. The second slot is only used
when a VC stalls. This uniform sharing of the channel among different VCs
leads to high buffer underutilization. The buffer underutilization gets worse when
the number of VCs increases. In the case of V active VCs, although the physical
channel will be fully utilized, each VC will receive a throughput of 1/V and use
only one of the two available buffer slots, since it is accessed once every V cycles.
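The 1/V throughput and the single occupied slot can be reproduced with a small cycle-level sketch. It assumes a deliberately simplified model that is not taken from the text: one flit arrives on the link per cycle, VCs take turns in strict rotation, the read side is never stalled downstream, and the function name `simulate` is hypothetical.

```python
from collections import deque


def simulate(num_vcs, cycles, slots=2):
    """Simplified model of one VC buffer stage with `slots` entries per VC.

    Returns (delivered, peak): flits drained per VC and the highest
    occupancy each VC buffer ever reached.
    """
    buffers = [deque() for _ in range(num_vcs)]
    delivered = [0] * num_vcs
    peak = [0] * num_vcs
    rr = 0  # round-robin read pointer
    for t in range(cycles):
        # Read side: drain one flit from the next non-empty VC.
        for offset in range(num_vcs):
            vc = (rr + offset) % num_vcs
            if buffers[vc]:
                buffers[vc].popleft()
                delivered[vc] += 1
                rr = (vc + 1) % num_vcs
                break
        # Write side: the link delivers one flit to the VC whose turn it is,
        # provided that VC still has a free slot.
        vc_in = t % num_vcs
        if len(buffers[vc_in]) < slots:
            buffers[vc_in].append(t)
        peak = [max(p, len(b)) for p, b in zip(peak, buffers)]
    return delivered, peak


if __name__ == "__main__":
    cycles = 1000
    for v in (2, 4, 8):
        delivered, peak = simulate(v, cycles)
        rates = [round(d / cycles, 3) for d in delivered]
        print(v, rates, peak)
```

Under these assumptions each VC drains at roughly 1/V flits per cycle and its peak occupancy stays at one slot, so the second slot of every VC sits idle until a stall occurs.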
Only under extreme congestion will one see the majority of the second buffers
of each VC occupied. However, even under this condition, a single active VC is
allowed to stop and resume transmission at full rate independently of the rest
of the VCs. This feature is indeed useful in the case of traffic originating only from a
single VC, where any extra cycles spent per link will severely increase the overall
latency of the packet. However, in the case of multiple active VCs, whereby each
 