Advanced Queuing Topics (QOS-Enabled Networks) Part 2

Using RED with TCP Sessions

As discussed earlier, TCP has the concept of rate-optimization windows, such as the congestion window and the receive window, which translate into the volume of traffic that can be transmitted or received before the sender has to wait for an acknowledgment. Once an acknowledgment is received, data equal to the smaller of the receiver’s advertised window and the sender’s congestion window, minus the still-unacknowledged data, can be transmitted.
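
As a back-of-the-envelope illustration of this accounting (a minimal sketch, not any particular stack’s logic), the usable send window at a given instant could be computed as follows:

```python
def usable_window(cwnd: int, rwnd: int, bytes_in_flight: int) -> int:
    """Bytes a TCP sender may still transmit before it must wait for ACKs.

    cwnd:            sender's congestion window, in bytes
    rwnd:            receiver's advertised (receive) window, in bytes
    bytes_in_flight: data already sent but not yet acknowledged, in bytes
    """
    # The effective window is the smaller of the two windows,
    # reduced by the data that is still unacknowledged.
    return max(0, min(cwnd, rwnd) - bytes_in_flight)

# Example: a 64 KB congestion window, a 32 KB receive window, and 20 KB
# in flight leave 12 KB (12288 bytes) that can be sent immediately.
print(usable_window(64 * 1024, 32 * 1024, 20 * 1024))
```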

Figure 8.10 TCP congestion mechanism

But let us recap briefly: The source and destination always try to have the biggest window size possible to improve the efficiency of the communication. The session starts with a conservative window size and progressively tries to increase it. However, if traffic is lost, the sender’s and receiver’s congestion window sizes are reduced, and traffic whose receipt was not acknowledged is retransmitted. The window size reduction is controlled by the TCP back-off mechanism, and it is proportional to the amount of loss, with a bigger loss leading to a more aggressive reduction in the window size. A severe traffic loss with timeouts causes the window size to revert to its initial value and triggers a TCP slow-start stage (Figure 8.10).
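
A rough model of that reaction, in the spirit of the classic Reno-style behavior described above (real stacks differ in many details), might look like this:

```python
INITIAL_CWND = 1  # segments; a deliberately simplified initial window

def react(cwnd: int, ssthresh: int, event: str) -> tuple[int, int]:
    """Very simplified TCP congestion reaction, counted in segments.

    event: 'ack'      -> grow the window (slow start or congestion avoidance)
           'dup_acks' -> mild loss signal: proportional back-off (halve)
           'timeout'  -> severe loss: revert to the initial window, slow-start
    """
    if event == "ack":
        if cwnd < ssthresh:
            cwnd *= 2                     # slow start: exponential growth
        else:
            cwnd += 1                     # congestion avoidance: linear growth
    elif event == "dup_acks":
        ssthresh = max(cwnd // 2, 2)
        cwnd = ssthresh                   # back-off proportional to the loss
    elif event == "timeout":
        ssthresh = max(cwnd // 2, 2)
        cwnd = INITIAL_CWND               # severe loss: back to slow start
    return cwnd, ssthresh
```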


The question is, how can RED usage improve this behavior? First, let us identify the problem clearly: massive traffic loss leads to a drastic reduction in window size. When is traffic loss massive? When the queue is full.

When the queue fills, all active TCP sessions simultaneously implement their TCP back-off algorithms, lowering their congestion window sizes and reducing the volume of offered traffic, thus mitigating the congestion. Because all the sessions then start to increase their congestion window sizes roughly in unison, a congestion point is again reached and all the sessions again implement the back-off algorithm, entering a loop between the two states. This symptom is most often referred to as TCP synchronization.

In this situation, RED can be used to start discarding traffic in a progressive manner before the queue becomes full. Assume that as the queue reaches a fill level close to 100%, a drop probability of X% is defined. This translates into "The queue is close to getting full, so out of 100 packets, drop X." Thus, before reaching the stage in which the queue is full and massive amounts of traffic start to be dropped, we start to discard some packets. This process gives the TCP sessions time to adjust their congestion window sizes instead of reverting to the initial, slow-start congestion window size.
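
A minimal sketch of that early-drop decision, using illustrative fill levels and probabilities rather than any vendor’s defaults:

```python
import random

def red_drop(fill: float, start: float = 0.8, max_p: float = 0.1) -> bool:
    """Decide probabilistically whether to discard an arriving packet.

    fill:   current queue fill level, from 0.0 (empty) to 1.0 (full)
    start:  fill level at which early dropping begins
    max_p:  drop probability as the queue approaches 100% full
    """
    if fill < start:
        return False          # queue comfortably below the threshold: keep all
    if fill >= 1.0:
        return True           # queue full: everything is tail-dropped anyway
    # Ramp the drop probability linearly from 0 at 'start' to max_p at full.
    probability = max_p * (fill - start) / (1.0 - start)
    return random.random() < probability
```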

That’s the theory. But does RED work? There has been debate about how useful RED is in modern networks. The original design by Floyd and Jacobson focused on long-lived TCP sessions. For example, a typical file transfer, such as an FTP session, reacts by lowering the congestion window size when it receives three or more duplicate ACKs, and users are rarely aware of the event unless the session enters the slow-start phase or stalls. But the explosion of HTTP has raised questions about how RED affects short-lived sessions. A web session often just resends packets after a timeout; the end user either never notices the problem or simply clicks again on the web page.

It is worth discussing some basics at this point. First, for RED to be effective and keep the congestion window within acceptable limits, sessions must be long-lasting. That is, they need to be active long enough to build up a true congestion window. By simply dropping packets from sessions with limited congestion windows, such as HTTP continuation packets, RED probably cannot do much to tweak the TCP congestion mechanism. Sessions must also have a fair number of packets in the buffer: the more packets a specific session has in the buffer when congestion is heavy, the greater the chance that the session experiences drops. Second, modern TCP stacks are actually very tolerant of drops. Features such as selective acknowledgment (SACK), fast retransmit, and fast recovery allow a session to recover very quickly from a few drops by triggering retransmission and rapidly building up the congestion window again.

The trend now is towards long-lasting sessions, which are commonly used for downloading streamed media, "webcasting," and peer-to-peer (P2P) file sharing. At the same time, current TCP stacks can take quite a beating for a short period without significantly slowing down other sessions. To ramp down sessions, either many sessions must be competing, thus preventing any build-up of congestion windows, or the RED profiles must be very aggressive and kick in relatively early, before congestion builds up in the queue. If the "sessions" are UDP-based, RED is not effective at all, except possibly to shape the traffic volume, a use likely outside the original RED design. If RED profiles merely provide alternate limits on how long such packets can stay in the buffer, RED can be applied to UDP traffic as a set of several tail-drop lengths. But again, this use does not follow the original definition of RED.

Differentiating Traffic Inside a Queue with WRED

When different types of traffic share the same queue, it is sometimes necessary to differentiate between them. Here, one method used frequently is weighted RED (WRED). WRED is no different from applying a different RED profile for each code point (see Figure 8.11).
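
In other words, WRED boils down to a per-code-point lookup of RED parameters. A minimal sketch, with hypothetical class names and threshold values:

```python
import random

# One RED profile per code point: (fill level where dropping starts,
# maximum drop probability as the queue approaches full).
WRED_PROFILES = {
    "in-contract": (0.70, 0.05),      # drop late and gently
    "out-of-contract": (0.40, 0.30),  # drop early and aggressively
}

def wred_drop(code_point: str, fill: float) -> bool:
    """Apply the RED profile that matches the packet's code point."""
    start, max_p = WRED_PROFILES[code_point]
    if fill < start:
        return False
    if fill >= 1.0:
        return True
    # Linear ramp from 0 at 'start' to max_p at 100% fill.
    return random.random() < max_p * (fill - start) / (1.0 - start)
```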

To provide a concrete scenario, suppose in-contract and out-of-contract traffic are queued together. While queuing out-of-contract traffic can be acceptable (depending on the service agreement), what is not acceptable is for out-of-contract traffic to have access to queuing resources at the expense of dropping in-contract traffic.

Figure 8.11 Multiple RED drop levels in the same queue

Figure 8.12 Aggressive RED drop levels for out-of-contract traffic

Improper resource allocation can be avoided by applying an aggressive RED profile to out-of-contract traffic, which ensures that while the queue fill level is low, both in-contract and out-of-contract traffic have access to the same resources. However, when the queue fill level increases beyond a certain point, queuing resources are reserved for in-contract traffic and out-of-contract traffic is dropped. Figure 8.12 illustrates this design for an architecture that drops packets from the head of the queue.

In effect, this design creates two queues in the same physical queue. In this example, packets that match profile 1, shown in a dark color, are likely to be dropped by RED when they enter the queue, and they are all dropped once queue buffer utilization exceeds 50%. Packets matching profile 2, shown in white, get a free ride until the queue is 50% full, at which point RED starts to drop white packets as well.

The classification of out-of-contract packets can be based on a multifield (MF) classifier or can be set as part of a behavior aggregate (BA). For example, traffic in a best-effort (BE) service queue can be demoted to lower-than-best-effort (LBE): if ingress traffic exceeds a certain rate, the classification changes the code point from BE to LBE, effectively applying a more aggressive RED profile. Typically, in-contract and out-of-contract traffic share the same queue, because using a different queue for each may cause packet-reordering issues at the destination. Even though the RED design mostly focuses on TCP, which has mechanisms to handle possible packet reordering, having the two types of traffic use different queues can cause unwanted volumes of retransmissions and unwanted amounts of delay, jitter, and buffering on the destination node.
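
Such rate-based remarking is typically implemented with a policer. Below is a generic single-rate token-bucket sketch; the class name, rate handling, and code-point labels are illustrative assumptions, not a specific vendor’s policer:

```python
import time

class RemarkingPolicer:
    """Token bucket that demotes traffic exceeding the contracted rate."""

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0    # token fill rate, bytes per second
        self.burst = burst_bytes      # bucket depth, bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def classify(self, packet_len: int) -> str:
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_len:
            self.tokens -= packet_len
            return "BE"               # in contract: keep the BE code point
        return "LBE"                  # out of contract: demote to LBE
```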

An alternative design for having multiple traffic profiles in the same queue is to move traffic to another queue when the rate exceeds a certain traffic volume. The most common scenario for this design is the one discussed earlier, in which badly behaved best-effort traffic is classified as LBE. This design, of course, has both benefits and limitations. The benefits are that less complex RED profiles are needed for the queue and that best-effort in-contract traffic can use the entire length of the buffer. The limitations are that more queues are needed, so the available buffer space is shared among more queues, and that traffic can be severely reordered and jitter increased compared with keeping the traffic in the same physical buffer, resulting in more retransmissions. Current end-user TCP stacks can put packets back into the proper order without much impact on resources, but this is not possible for UDP packets, so applying this design to UDP traffic is a poor choice.

Head versus Tail RED

While the specifics of RED operation depend on the platform and the vendor, there are two general types worth describing: head and tail RED. The difference between them is where the RED operation happens: at the head of the queue or at the tail. As with most things, there are upsides and downsides to both implementations. Before discussing these, it is important to note that head and tail RED have nothing to do with the queue tail drops or packet aging (queue head drops) described in earlier topics. Those queue drops occur because the buffer is full or because packets are about to be aged out; RED drops occur independently of them.

With head RED, all packets are queued, and when a packet reaches the head of the queue, a decision is made about whether to drop it. The downside is that packets dropped by head RED still occupy the queue and travel all the way through it, only to be discarded when they reach the head.

Tail RED exhibits the opposite behavior: the packet is inspected at the tail of the queue, where the decision to drop it is made, so packets dropped by tail RED are never queued. The implementation of the RED algorithm is not very different for head and tail RED. Both monitor the queue depth, and both decide whether to drop packets and at what rate to drop them. The difference is where the decision is applied and how the two behave as a result.
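
The placement difference can be sketched with a toy queue model; the red_drop() helper here is a stand-in for whatever drop-probability function the platform actually implements:

```python
import random
from collections import deque

def red_drop(fill: float) -> bool:
    """Placeholder RED decision: drop more often as the queue fills."""
    return random.random() < max(0.0, fill - 0.5)  # illustrative ramp only

class RedQueue:
    def __init__(self, size: int, mode: str):
        self.q: deque = deque()
        self.size, self.mode = size, mode  # mode: 'head' or 'tail'

    def enqueue(self, pkt) -> None:
        # Tail RED: decide before the packet ever occupies a buffer slot.
        if self.mode == "tail" and red_drop(len(self.q) / self.size):
            return
        if len(self.q) < self.size:
            self.q.append(pkt)

    def dequeue(self):
        # Head RED: the packet has travelled the whole queue and may
        # still be discarded at the moment it reaches the head.
        while self.q:
            pkt = self.q.popleft()
            if self.mode == "head" and red_drop(len(self.q) / self.size):
                continue
            return pkt
        return None
```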

Head-based RED is good for using high-speed links efficiently because RED is applied to packets already in the queue. Utilization of the link and its buffer is thereby maximized, and short bursts can be absorbed without packets being dropped.

Figure 8.13 Head-based RED

Figure 8.14 Tail-based RED

On low-speed links, head-based RED has drawbacks, because packets that are selected to be dropped are in the way of packets that are not being dropped. This head-of-line blocking situation means that packets not selected for dropping must stay in the queue longer, which can reduce the available queue depth and result in unintended packets being dropped. To avoid this situation, the RED profiles in the same queue need to be considerably different to effect more aggressive behavior (see Figure 8.13).

On the other hand, tail-based RED monitors the queue depth and calculates the average queue length of the stream inbound to the queue. Tail-based RED is good for low-speed links because it avoids the head-of-line blocking situation. However, packet dropping can be too aggressive, with the result that the link and buffer might not be fully utilized. To avoid RED being too aggressive, profiles must be configured to allow something of a burst before starting to drop packets (see Figure 8.14).

Figure 8.15 Bad news should travel fast

One more academic reason favors head-based RED over tail-based RED: in congestion scenarios, the destination is made aware of the problem sooner, following the maxim that bad news should travel fast. Consider the scenario in Figure 8.15.

In Figure 8.15, assume congestion is occurring and packets need to be dropped. Head RED operates at the queue head, so it drops P1, immediately making the destination aware of the congestion because P1 never arrives. Tail RED operates at the queue’s tail, so when congestion occurs, it stops packet P5 from being placed in the queue and drops it. So the destination receives P1, P2, P3 and P4, and only when P5 is missed is the destination made aware of congestion. Assuming a queue length of 200 milliseconds as in this example, the difference is that the destination is made aware of congestion 200 milliseconds sooner with head RED. For modern TCP-based traffic with more intelligent retransmission features, this problem is more academic in nature. The receiver triggers, for example, SACK feedback in the ACK messages to inform the sender about the missing segment. This makes TCP implementations less sensitive to segment reordering.

Segmented and Interpolated RED Profiles

A RED profile consists of a list of pairs, where each pair consists of a queue fill level and the associated drop probability. On an X-Y axis, each pair is represented by a single point. Segmented and interpolated RED are two different methods for connecting the dots (see Figure 8.16). However, both share the same basic principle of operation: first, calculate an average queue length, then generate a drop probability factor.

Segmented RED profiles assume that the drop probability remains unchanged from the fill level at which it was defined, right up to the fill level at which a different drop probability is specified. The result is a stepped behavior.

Interpolated RED profiles instead use exponential weighting to dynamically create a smooth drop probability curve through the defined points. Because implementations of RED are, like most things with QOS, vendor-specific, different configurations are needed to arrive at the same drop probability X for a given average queue length Y.

Figure 8.16 Interpolated and segmented RED profiles
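
Connecting the dots in the two ways can be sketched as follows. For simplicity, this example uses straight-line interpolation between points rather than the exponential weighting mentioned above, and the profile values are hypothetical:

```python
def segmented_p(profile: list[tuple[float, float]], fill: float) -> float:
    """Stepped behavior: a probability holds until the next defined level."""
    p = 0.0
    for level, prob in sorted(profile):
        if fill >= level:
            p = prob
    return p

def interpolated_p(profile: list[tuple[float, float]], fill: float) -> float:
    """Connect adjacent (fill level, probability) points into a curve."""
    pts = sorted(profile)
    if fill <= pts[0][0]:
        return pts[0][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= fill <= x1:
            return y0 + (y1 - y0) * (fill - x0) / (x1 - x0)
    return pts[-1][1]

# Hypothetical profile: dropping starts at 50% fill, everything dropped at 90%.
profile = [(0.5, 0.0), (0.7, 0.1), (0.9, 1.0)]
print(segmented_p(profile, 0.8))     # 0.1  -> holds the last defined step
print(interpolated_p(profile, 0.8))  # 0.55 -> halfway between 0.1 and 1.0
```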

One implementation example creates the drop probability based on the input of a minimum and a maximum threshold: between the two thresholds, the drop rate ramps up according to a probability weight, and once the maximum threshold is reached, packets are tail-dropped. The drop curve is thus generated based on the average queue length (see Figure 8.17).
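
A compact sketch of this threshold scheme (the parameter values are illustrative, and refinements such as counting packets since the last drop are omitted):

```python
import random

class ClassicRed:
    """Threshold-based RED: EWMA average queue length vs. min/max thresholds."""

    def __init__(self, min_th: float, max_th: float,
                 max_p: float, weight: float = 0.002):
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.weight = weight  # EWMA weight for the average queue length
        self.avg = 0.0

    def should_drop(self, queue_len: float) -> bool:
        # The moving average smooths out short bursts, so brief spikes
        # in queue depth do not immediately trigger drops.
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        if self.avg < self.min_th:
            return False              # below the minimum threshold: keep all
        if self.avg >= self.max_th:
            return True               # above the maximum threshold: drop all
        # Between the thresholds, probability ramps from 0 up to max_p.
        p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        return random.random() < p
```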

A variation of this implementation replaces the weight with averages computed as percentages of discards compared with average queue lengths (see Figure 8.18).

What is the best approach? Well, it is all about being able to generate a configuration that behaves predictably, while at the same time being flexible enough to change if new demands arise.

Extrapolating this theoretical exercise to real life, most implementations have restrictions. Even with an interpolated design that generates exponential fill-level and drop values, there are limitations on the range of measured points. For example, there is always a limit on the number of configurable elements, or dots: 8, 16, 32, or 64, depending on the vendor implementation and hardware capabilities. Also, rounding and averaging are always involved in the computed ratio of discards to queue length.

In a nutshell, there is little proof of any practical difference between segmented and interpolated implementations. In the end, what matters more is being able to configure several profiles that apply to different traffic-control scenarios and that can be implemented on several queues or on the same queue.

Conclusion

So, does RED work in today’s networks? The original version of RED was designed with an eye on long-lasting TCP sessions, for example, file transfers. By dropping packets randomly, RED gives hosts the opportunity to lower the cwnd when they receive duplicate ACKs. With the explosion of web surfing, there has been an increase in the number of short-lived sessions, and RED is not very efficient for these short web transactions, because a web session often just resends packets after a timeout as its congestion mechanism. But, as discussed earlier, with the recent growth of file sharing and the downloading of streamed media, long-lasting sessions have returned to some extent. So is RED useful now? The answer is probably yes, but a weak yes.

Figure 8.17 Interpolated RED drop curve example 1

Figure 8.18 Interpolated RED drop curve example 2

The value of using RED profiles for TCP flows is questionable in environments that use the latest TCP implementations, in which the TCP stacks are able to take a considerable beating, as described earlier. Features such as SACK, which in each ACK message can provide pointers to specific lost segments, help TCP sessions maintain a high rate of delivery, greatly reducing the need for massive amounts of retransmission if a single segment arrives out of order. The result is that any RED drop levels need to be aggressive to slow down a single host; that is, dropping one segment is not enough.

Because RED is not aware of sessions, it cannot punish individual "bad guys" that ramp up the cwnd and rate, taking bandwidth from others. The whole design idea behind RED is that bandwidth-greedy sessions have many packets in the queue, which increases the chance that drops hit precisely those sessions.

However, RED is still a very powerful tool for differentiating between different traffic types inside the queue itself, for example, if packets have different DSCP markings.
