Virtual-Channel Flow Control and Buffering - Microarchitecture of Network-on-Chip Routers - page 94


[Figure 6.6: cycle-by-cycle timing diagram (cycles 0 to 8) with rows labelled Input, VCbuf#0, Link, VCbuf#1, and Output, tracing the flits of two VCs, A and B; annotated "VC B stalls" and "VC B released".]

Fig. 6.6 Flit flow on a channel between two primitive VC buffers that employ a 2-slot EB for each VC

On the read side of the VC buffer, an arbitration mechanism selects only one of the valid VCs that is also ready downstream to leave the buffer. A packet that belongs to the i-th VC can be hosted either in the same VC of the next buffer or in a different VC, provided that it has won exclusive access to that VC. VC-based flow control does not impose any rules on how the VCs should be assigned between a sender buffer and a receiver buffer. Allowing packets to change VC in flight is possible when the routing algorithm does not impose any VC restrictions (e.g., XY routing does not even require the presence of VCs). However, if the routing algorithm and/or the upper-layer protocol (e.g., cache coherence) place specific restrictions on the use of VCs, then arbitrary in-flight VC changes are prohibited, because they may lead to deadlocks. In the presence of VC restrictions, the allocator/arbiter should enforce all rules during VC allocation to ensure deadlock freedom. The VC selection policies used inside the routers will be thoroughly discussed in the next chapter.
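The read-side selection and the exclusive-access rule described above can be sketched in software as follows. This is only an illustrative model, not the book's hardware: the round-robin policy and the names `select_vc`, `can_allocate`, `owner`, and `allowed_out_vcs` are assumptions introduced here, since the text leaves the actual selection policies to the next chapter.

```python
def select_vc(valid, ready, last_granted):
    """Pick the VC that leaves the buffer this cycle, or None.

    valid[i]      -- VC i holds at least one flit in this buffer
    ready[i]      -- the downstream buffer of VC i can accept a flit
    last_granted  -- VC granted most recently (round-robin pointer;
                     the round-robin policy is an assumption)
    """
    n = len(valid)
    # Start searching just after the last winner so that no VC is starved.
    for offset in range(1, n + 1):
        i = (last_granted + offset) % n
        if valid[i] and ready[i]:
            return i
    return None


def can_allocate(out_vc, packet_id, owner, allowed_out_vcs):
    """Check whether a packet may be hosted in out_vc of the next buffer.

    owner[v]         -- id of the packet currently holding VC v, or None
    allowed_out_vcs  -- VCs permitted by the routing algorithm or protocol
                        (hypothetical representation of the VC restrictions)
    """
    if out_vc not in allowed_out_vcs:
        return False  # would violate a VC restriction
    # The VC must be free, or already held exclusively by this packet.
    return owner[out_vc] is None or owner[out_vc] == packet_id


# VC 1 is valid but not ready downstream, so VC 0 is selected again.
print(select_vc([True, True], [True, False], last_granted=0))
```

Passing the full set of VCs as `allowed_out_vcs` models the unrestricted case, in which a packet may change VC in flight as long as it wins exclusive access.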

Figure 6.6 depicts a running example of a VC-based pipeline using a 2-slot EB per VC. The two active VCs each receive a throughput of 50% and each VC uses only one buffer slot out of the two available per VC. The second slot is used only when a VC stalls. This uniform sharing of the channel among different VCs leads to high buffer underutilization, which gets worse as the number of VCs increases. In the case of V active VCs, although the physical channel will be fully utilized, each VC will receive a throughput of 1/V and use only one of the two available buffer slots, since it is accessed once every V cycles.
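The 1/V throughput and the single occupied slot per VC can be checked with a small cycle-level model. The sketch below is a simplification under stated assumptions, not the book's design: a sender that always has flits for every VC, one receiver buffer with `slots` entries per VC, round-robin selection on both the link and the read side, and a hypothetical `blocked(vc, cycle)` hook that marks cycles in which a VC cannot leave the receiver.

```python
def next_rr(ptr, eligible):
    """Return the first eligible index at or after ptr (wrapping), or None."""
    n = len(eligible)
    for offset in range(n):
        i = (ptr + offset) % n
        if eligible[i]:
            return i
    return None


def simulate(num_vcs, cycles, slots=2, blocked=lambda vc, cycle: False):
    """Model num_vcs VCs sharing one link into per-VC buffers of `slots` flits.

    Returns (delivered, max_occ): flits that left the receiver per VC and
    the peak number of occupied slots per VC.
    """
    occ = [0] * num_vcs        # occupied slots per VC in the receiver
    delivered = [0] * num_vcs
    max_occ = [0] * num_vcs
    send_ptr = read_ptr = 0    # round-robin pointers (assumed policy)
    for cycle in range(cycles):
        # Both decisions use the buffer state at the start of the cycle.
        readable = [occ[v] > 0 and not blocked(v, cycle) for v in range(num_vcs)]
        writable = [occ[v] < slots for v in range(num_vcs)]
        read_vc = next_rr(read_ptr, readable)
        send_vc = next_rr(send_ptr, writable)
        if read_vc is not None:        # one flit leaves the receiver
            occ[read_vc] -= 1
            delivered[read_vc] += 1
            read_ptr = (read_vc + 1) % num_vcs
        if send_vc is not None:        # one flit crosses the link
            occ[send_vc] += 1
            send_ptr = (send_vc + 1) % num_vcs
        max_occ = [max(m, o) for m, o in zip(max_occ, occ)]
    return delivered, max_occ


print(simulate(2, 9))    # ([4, 4], [1, 1])
print(simulate(4, 41))   # ([10, 10, 10, 10], [1, 1, 1, 1])
```

In this model the first cycle only fills the pipeline, so after 1 + kV cycles each VC has delivered k flits while never holding more than one. Passing, for example, `blocked=lambda vc, cycle: vc == 1 and 2 <= cycle <= 5` stalls the second VC for a few cycles and makes it use its second slot.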

Only under extreme congestion will one see the majority of the second buffer slots of each VC occupied. However, even under this condition, a single active VC is allowed to stop and resume transmission at full rate independently of the rest of the VCs. This feature is indeed useful in the case of traffic originating from only a single VC, where any extra cycles spent per link will severely increase the overall latency of the packet. However, in the case of multiple active VCs, whereby each
