Policing and Shaping (QOS-Enabled Networks) Part 2

Dual-Rate Token Buckets

A policer can be defined to simply limit all traffic indiscriminately to a certain bandwidth, or it can be defined to be more granular. For an example of a scenario requiring greater granularity, consider that all traffic arriving from a customer should be globally policed to 10 Mbps and that the input traffic flow contains two types of traffic, voice and non-voice. Voice traffic should be policed to 2 Mbps, and this rate must be guaranteed. That is, as long as voice traffic remains below the 2-Mbps barrier, it must always be transmitted, as illustrated in Figure 6.11.

Figure 6.11 Different types of traffic with different policing requirements

Figure 6.12 Interconnection between the credit rates of two token buckets

The first possible solution to comply with these requirements is to use two independent token buckets to limit voice traffic to 2 Mbps and non-voice traffic to 8 Mbps. This scheme guarantees that voice traffic never exceeds the 2-Mbps barrier and also meets the requirement for a guaranteed rate. However, if no voice traffic is present, the total amount of bandwidth that non-voice traffic can use is nevertheless limited to 8 Mbps. The leftover bandwidth created by the absence of voice traffic is not accessible to non-voice traffic, which can be good or bad depending on the desired behavior. This waste of bandwidth is the price to pay for placing traffic into independent token buckets. The lack of communication between the two buckets implies that the bucket into which non-voice traffic is placed must implement the 8-Mbps bandwidth limit to assure that voice traffic always has access to its 2-Mbps guaranteed rate.

Another possible approach is defining two token buckets but linking the credit rates of both. In this approach, voice traffic is placed in a bucket called "premium" that imposes a 2-Mbps rate and non-voice traffic is placed in a bucket called "aggregate" that imposes a 10-Mbps rate. This scheme allows non-voice traffic to use up to 10 Mbps in the absence of voice traffic. However, it raises the concern of how to assure the 2-Mbps guaranteed rate to voice traffic. The answer is to link the credit rates of both token buckets. Every time a voice packet is transmitted, the available credit meters of both the premium and the aggregate are decremented, while transmission of a non-voice packet decrements only the aggregate available credit meter, as illustrated in Figure 6.12.
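
As a rough illustration of this linkage, the following is a minimal Python sketch of the two linked buckets. The class, parameter values, and packet sizes are purely illustrative, and because the text above does not spell out how the admission check for a voice packet is performed, the sketch makes one reasonable assumption: a voice packet is admitted against the premium bucket (preserving its 2-Mbps guarantee) and its transmission decrements both credit meters.

```python
# Illustrative sketch of two token buckets with linked credit rates.
# "premium" polices voice at 2 Mbps; "aggregate" polices all traffic at 10 Mbps.
# A voice packet decrements BOTH meters; a non-voice packet decrements only the aggregate.

class TokenBucket:
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0      # credit rate in bytes per second
        self.burst = burst_bytes        # bucket depth (burst size limit)
        self.credits = burst_bytes      # start full
        self.last = 0.0

    def refill(self, now):
        self.credits = min(self.burst, self.credits + (now - self.last) * self.rate)
        self.last = now

    def has(self, size):
        return self.credits >= size

    def consume(self, size):
        self.credits -= size            # may go negative (a debt paid back by later refills)


premium = TokenBucket(rate_bps=2_000_000, burst_bytes=15_000)     # voice
aggregate = TokenBucket(rate_bps=10_000_000, burst_bytes=75_000)  # all traffic


def police(packet_size, is_voice, now):
    """Return True if the packet is transmitted, False if it is dropped."""
    premium.refill(now)
    aggregate.refill(now)
    if is_voice:
        # Assumed interpretation: voice is admitted against the premium bucket so its
        # 2-Mbps rate is guaranteed; transmitting it decrements BOTH credit meters.
        if premium.has(packet_size):
            premium.consume(packet_size)
            aggregate.consume(packet_size)
            return True
        return False
    # Non-voice is admitted against the aggregate bucket only, so in the absence of
    # voice it can use the full 10 Mbps.
    if aggregate.has(packet_size):
        aggregate.consume(packet_size)
        return True
    return False


print(police(1500, is_voice=True, now=0.0))  # True: the first voice packet fits both buckets
```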

Dual-rate policers are popular and are commonly implemented together with the metering tool. The most common implementation deployed is the two-rate, three-color marker, defined in RFC 2698 [1].
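
For reference, the color-blind mode of the RFC 2698 marker can be sketched as follows. This is a simplified illustration rather than a vendor implementation, and the rates and burst sizes chosen are arbitrary.

```python
# Color-blind two-rate three-color marker (RFC 2698), sketched in Python.
# tc/tp are the token counts of the committed and peak buckets; CIR/PIR are the
# committed and peak information rates, CBS/PBS the corresponding burst sizes.
import time

class TrTCM:
    def __init__(self, cir_bps, cbs_bytes, pir_bps, pbs_bytes):
        self.cir, self.cbs = cir_bps / 8.0, cbs_bytes
        self.pir, self.pbs = pir_bps / 8.0, pbs_bytes
        self.tc, self.tp = cbs_bytes, pbs_bytes   # both buckets start full
        self.last = time.monotonic()

    def mark(self, size_bytes):
        now = time.monotonic()
        elapsed = now - self.last
        self.last = now
        self.tc = min(self.cbs, self.tc + elapsed * self.cir)
        self.tp = min(self.pbs, self.tp + elapsed * self.pir)
        if self.tp - size_bytes < 0:
            return "red"                          # exceeds the peak rate
        if self.tc - size_bytes < 0:
            self.tp -= size_bytes
            return "yellow"                       # within peak, above committed
        self.tc -= size_bytes
        self.tp -= size_bytes
        return "green"                            # within the committed rate


# Illustrative parameters: 2-Mbps committed rate, 10-Mbps peak rate.
marker = TrTCM(cir_bps=2_000_000, cbs_bytes=15_000, pir_bps=10_000_000, pbs_bytes=75_000)
print(marker.mark(1500))  # 'green' for the first full-size packet
```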

Shapers and Leaky Buckets

As previously discussed in Part One, the goal of a shaper is to make the received traffic rate conform to the bandwidth value in the shaper definition, also called the shaping rate. A shaper has one input and one output, and the output traffic flow conforms to the defined shaper rate. Any excess traffic is stored inside the shaper and is transmitted only when possible, that is, when transmitting it does not violate the shaping rate.

A shaper is implemented using a leaky bucket concept that consists of a queue of a certain length, called the delay buffer. A guard at the queue head assures that the rate of the traffic leaving the leaky bucket conforms to the shaping rate value, as illustrated in Figure 6.13, which represents the shaping rate as a dotted line.

Usually, the shaping rate is measured in bits per second and the delay buffer length in milliseconds or microseconds. The shaping rate is a "constant" value in the sense that if the desired result is to have the input traffic shaped to X Mbps, the shaping rate should be set to the value X. The result is that only the delay buffer length is a variable.

Two factors must be taken into account when dimensioning the delay buffer length parameter. The first is that the length of the delay buffer is a finite value, so there is a maximum amount of excess traffic that can be stored before the delay buffer fills up and traffic starts being dropped. The second factor is that when excess traffic is placed inside the delay buffer, it is effectively being delayed, because the shaper guard enforces that traffic waits inside the delay buffer until transmitting it does not violate the shaper rate. This behavior is illustrated in Figure 6.13, which shows that the input traffic graph ends at T1, but the output traffic ends at T2. This is a key point to remember: the shaper’s capability to store excess traffic is achieved at the expense of introducing delay into the transmission of traffic.

Figure 6.13 Leaky bucket operation
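
A minimal sketch of this leaky-bucket behavior, assuming a software queue drained at the shaping rate, could look like the following. The class name and parameters are illustrative only: excess packets wait in a finite delay buffer and are released only when transmitting them does not violate the shaping rate; traffic is dropped only once that buffer is full.

```python
# Illustrative leaky-bucket shaper: a finite delay buffer drained at the shaping rate.
from collections import deque

class LeakyBucketShaper:
    def __init__(self, shaping_rate_bps, buffer_ms):
        self.rate = shaping_rate_bps / 8.0                  # bytes per second
        self.max_bytes = self.rate * (buffer_ms / 1000.0)   # delay buffer sized in time
        self.queue = deque()                                # the delay buffer
        self.queued_bytes = 0.0
        self.next_tx = 0.0                                  # earliest time the next packet may leave

    def enqueue(self, size_bytes):
        """Return True if the packet is stored, False if it is dropped (buffer full)."""
        if self.queued_bytes + size_bytes > self.max_bytes:
            return False                                    # only now is traffic dropped
        self.queue.append(size_bytes)
        self.queued_bytes += size_bytes
        return True

    def dequeue(self, now):
        """Release packets whose transmission no longer violates the shaping rate."""
        sent = []
        while self.queue and now >= self.next_tx:
            size = self.queue.popleft()
            self.queued_bytes -= size
            sent.append(size)
            # Serializing this packet at the shaping rate occupies size/rate seconds.
            self.next_tx = max(now, self.next_tx) + size / self.rate
        return sent


shaper = LeakyBucketShaper(shaping_rate_bps=10_000_000, buffer_ms=100)
shaper.enqueue(1500)
print(shaper.dequeue(now=0.0))  # [1500]: released immediately, since the rate is not violated
```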

The delay buffer is effectively a queue, so quantifying how much delay is inserted depends on the queue length, its fill level (how many packets are already inside the queue), and the speed at which packets are removed from the queue. That removal speed is obviously a function of the shaping rate, which is itself set to a constant value: the desired rate for traffic exiting the shaper tool.

Predicting the queue fill level at a specific point in time is impossible. However, it is possible to analyze the worst-case scenario. When a packet is the last one to enter a full queue, the time it has to wait until it reaches the queue head (and thus to become the next packet to be removed from the queue) can be represented by the queue length value. Thus, the worst-case scenario of the delay introduced by the shaping tool can be represented as the length of the delay buffer. Following this logic for the jitter parameter, the least possible delay introduced is zero and the maximum is the length of the delay buffer, so the smaller the gap between these two values, the smaller the possible jitter insertion.
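
As a quick worked example with made-up numbers, the worst-case delay can be computed directly from the delay buffer size and the shaping rate:

```python
# Worst-case shaper delay (illustrative numbers). If the delay buffer is configured in
# bytes, the worst case is the time needed to drain a full buffer at the shaping rate;
# if it is configured in milliseconds, the worst case is simply that configured length.
shaping_rate_bps = 10_000_000   # 10-Mbps shaping rate
buffer_bytes = 125_000          # 125 kB of delay buffer

worst_case_delay_s = buffer_bytes * 8 / shaping_rate_bps
print(worst_case_delay_s)       # 0.1 -> a packet entering a full buffer waits up to 100 ms
# Jitter then ranges between 0 (empty buffer) and this worst-case value.
```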

As previously discussed for real-time traffic, when sensitivity to delay and jitter is high, the shaper tool should not be applied at all, or it should be applied with a very small delay buffer length. At the opposite end of the spectrum, non-real-time traffic offers a greater margin for dimensioning the length of the delay buffer because this traffic is less sensitive to delay and jitter.

Excess Traffic and Oversubscription

The existence of excess traffic is commonly associated with the oversubscription phenomenon, the concept of having a logical interface with a certain bandwidth value that during some periods of time may receive more traffic than it can cope with, as illustrated in Figure 6.14.

Oversubscription is popular in scenarios in which it is fair to assume that not all traffic sources transmit to the same destination constantly and simultaneously, so the bandwidth of the interface to the destination can be lower than the sum of the maximum rates at which each source can transmit. However, with oversubscription, transient situations can occur in which the bandwidth of the interface to the destination is insufficient to cope with the amount of traffic being transmitted to it, as illustrated in Figure 6.14.

Figure 6.14 Oversubscription scenario

Oversubscription is a scenario commonly used with the shaping tool, because its application makes it possible to guarantee that the rate of traffic arriving at the logical interface to a destination complies with the shaper’s rate. So by setting the shaper rate equal to the logical interface bandwidth, any excess traffic generated by the transient conditions is stored inside the shaper tool instead of being dropped. Such logic is valid for a shaping tool applied at either an ingress or egress interface.

A typical network topology that creates transient conditions of excess traffic is a hub and spoke, in which multiple spoke sites communicate with each other through a central hub site. In terms of the connectivity from the spoke sites towards the hub, the bandwidth of the interface connecting to the hub site can be dimensioned as the sum of the maximum rates of each spoke site, or alternatively the bandwidth can be set at a lower value and the shaping tool can absorb any excess traffic that may exist. However, the second approach works only if the presence of excess traffic is indeed a transient condition, meaning that the delay buffer of the shaper does not fill up and drop traffic, and only if the traffic being transmitted can cope with the delay that the shaping tool inserts.

This topology is just one scenario that illustrates the need for dealing with excess traffic. Several others exist, but the key concept in all scenarios is the same.

The business driver for oversubscription is cost savings, because it requires that less bandwidth be contracted from the network.

Comparing and Applying Policer and Shaper Tools

So far, this topic has focused on separately presenting the details of the token bucket and leaky bucket concepts used to implement the policer and shaper tools, respectively. This section compares the two side by side.

As a starting point, consider Table 6.1, which highlights the main characteristics of the two tools. The striking similarity is that the end goal of shapers and policers is the same: to have traffic that exits the tool conform to a certain bandwidth value expressed in bits per second. Let us now focus on the differences in terms of how the two achieve this same goal.

The first difference is that the policer introduces no delay. As previously discussed, the delay introduced by shaping is the time that packets have to wait inside the delay buffer until being transmitted, where, in the worst-case scenario, the delay introduced equals the length of the delay buffer.

Table 6.1 Differences between policing and shaping

Tool      Input parameters                     End goal                                          Delay introduced   Excess traffic
Policer   Bandwidth limit, burst size limit    Traffic conforming to a certain bandwidth value   No                 Dropped
Shaper    Shaping rate, delay buffer length    Traffic conforming to a certain bandwidth value   Yes                Stored

Obviously, delay is introduced only if there is excess traffic present. The fact that variations in the delay inserted are possible implies that the introduction of jitter is also a factor that must be accounted for. It is quite common in the networking world to use the analogy that the policer burst size limit parameter is a measure of delay introduced, but this is incorrect. The policer action introduces no delay whatsoever, independent of the burst size limit value configured.

The second difference is that the policer drops excess traffic (which is why no delay is inserted), but the shaper stores it while space is available in the delay buffer, dropping the excess traffic only after the buffer fills up. However, as previously discussed, the policer can absorb traffic bursts as a function of its burst size limit parameter. It is interesting to highlight the differences between absorbing traffic bursts and allowing what we are calling excess traffic.

Suppose that traffic is crossing the policer at a constant rate and that at a certain point in time the rate increases (that is, a burst occurs), thus increasing the arrival rate of packets at the token bucket. If absorbing the burst depletes the credits completely, then when the next packet arrives at the token bucket, no resources (credits) are available to transmit it and it is dropped. Looking at the same situation from a shaper perspective, when a packet cannot be transmitted, it is not dropped but rather stored inside the delay buffer. The packet waits until it can be transmitted without violating the shaping rate. The only exception is when the delay buffer is already full, in which case the packet is dropped.
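
The following toy comparison makes the contrast concrete. All numbers are made up, the burst is modeled as ten packets arriving at the same instant, and queue drain during that instant is ignored: the policer transmits only what its remaining credits allow and drops the rest, while the shaper queues packets (introducing delay) and drops only those that overflow its buffer.

```python
# Toy comparison of a policer (token bucket) and a shaper (leaky bucket) at the same
# 1-Mbps rate, facing the same instantaneous burst. Numbers are illustrative only.
RATE = 1_000_000 / 8.0   # bytes per second
BURST = 3_000            # policer burst size limit, in bytes
BUFFER = 12_000          # shaper delay buffer, in bytes
PKT = 1_500              # packet size

arrivals = [0.0] * 10    # ten packets arrive back to back at time 0

# Policer: credits deplete, then packets are dropped outright (no delay introduced).
credits, last = BURST, 0.0
policer_sent = policer_dropped = 0
for t in arrivals:
    credits = min(BURST, credits + (t - last) * RATE); last = t
    if credits >= PKT:
        credits -= PKT
        policer_sent += 1
    else:
        policer_dropped += 1

# Shaper: packets are queued and released later, dropped only if the buffer overflows.
queued = shaper_dropped = 0
next_tx = 0.0
delays = []
for t in arrivals:
    if queued + PKT > BUFFER:
        shaper_dropped += 1
        continue
    queued += PKT
    next_tx = max(next_tx, t) + PKT / RATE
    delays.append(next_tx - t)   # time from arrival until it has been sent at the shaping rate

print(policer_sent, policer_dropped)   # 2 transmitted, 8 dropped
print(shaper_dropped, max(delays))     # 2 dropped, worst wait of ~0.096 s
```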

We mentioned at the beginning of this topic that we are considering a policer action of discarding traffic that violates the defined policer rate (that is, hard policing). Let us now examine the effects of using an action other than discard, an action commonly named soft policing. Consider a policer named X, whose action regarding traffic that exceeds the defined policer rate is to accept it and mark it differently (e.g. as "yellow," while traffic below the policer rate is marked as "green"). One could argue that policer X also accepts excess traffic, which is true. However, policer X is allowing yellow traffic to pass through it (not discarding it and not storing it, but just marking it with a different color). So the total amount of traffic (green and yellow) present at the output of the policer X does not conform to the defined policer rate, because excess traffic is not discarded or stored.

Deciding where to apply the policer and shaper tools in the network effectively boils down to the specific behavior that is desired at each point. Taking into account the differences highlighted in Table 6.1 and in the previous paragraphs, the most striking difference between the two is how each one deals with excess traffic. Practical examples for the application of both tools are presented in the case studies in Part Three, but at this stage, we highlight some of the common application scenarios.

Some common scenarios involve a point in the network at which the bandwidth needs to be enforced at a certain value, either because of a commercial agreement, an oversubscribed interface, or a throughput reduction because of a mismatch between the amount of traffic one side can send and the other side can receive.

Figure 6.15 Scenario for policing and shaping applicability

To analyze the applicability of the policing and shaping tools, let us use the scenario illustrated in Figure 6.15, which shows two routers R1 and R2, with traffic flowing from R1 towards R2.

From a router perspective, the policer and shaper tools can be applied to both ingress traffic entering the router and egress traffic leaving the router. The decision about where to apply the tools depends on the behavior that needs to be implemented, which is affected in part by the trust relationship between the two routers. For example, if R1 and R2 have an agreement regarding the maximum bandwidth that should be used on the link that interconnects the two routers, should R2 implement a tool to enforce the limit, or can it trust R1 not to use more bandwidth than what has been agreed?

The trust parameter is a crucial one, and typically the network edge points (where the other side is considered untrustworthy) are where these types of tools are commonly applied.

Let us start by analyzing the scenario of excess traffic. If R1 has excess traffic, it has two options. The first is for R1 to shape the traffic and then transmit it to R2. The second is for R1 to transmit the traffic and for R2 to use ingress shaping to deal with the excess. If R1 considers R2 to be untrustworthy regarding how it deals with excess traffic, the second option should be ruled out. If, on the other hand, R1 considers R2 to be trusted, the shaping tool can be applied at the ingress on R2.

Regarding the need to enforce the traffic from R1 to R2 at a certain rate by using the policer tool, if the relationship in place between the two routers is one of trust, a tool to enforce the rate can be applied at the egress on R1 or at the ingress on R2. Ideally, though, the tool should be applied on the egress at R1 to avoid using link capacity for traffic that will just be dropped at the ingress on R2. If R2 does not trust R1 to limit the traffic it sends, the only option is for R2 to implement an ingress tool to enforce the limit.

The discussion in this section provides only generic guidelines. The reader should focus on the requirements to be implemented, together with the capabilities and impacts of each tool, and should never shy away from using the tools in ways different from the ones illustrated here.

Conclusion

Throughout this topic, we have discussed the mechanics used to implement policing and shaping and the differences between them. The aim of both the policer and shaper tools is to impose an egress rate on the traffic that crosses them. The policer also imposes a limit on how bursty the traffic can be, while the shaper eliminates such traffic bursts at the expense of delaying traffic.

The burst size limit parameter represents the policer's tolerance to traffic burstiness. However, even if the traffic flows at a constant rate below the policer bandwidth limit and no bursts are present, this parameter still needs to be dimensioned, because packets crossing the policer always consume credits.
