Traffic Types (QOS-Enabled Networks) Part 3

QOS Conclusions for VOIP

Because voice relies heavily on QOS, one must calculate the bandwidth that needs to be scheduled and protected (so that resources are available when necessary) to meet the service and quality demands. The telephony world has a term for this, counting Erlangs, named after the Danish mathematician and telephone engineer Agner Krarup Erlang, and the concept is widely used in telecommunications technology. In the PSTN (Public Switched Telephone Network) realm, bandwidth is handled by allocating timeslots. Once allocated, a timeslot cannot be reused even if it is idle and no one is talking. The implication is that to deliver 100% service, many timeslot circuits are allocated regardless of whether they are used, which leads to a simple dilemma: should a network that has no resource-sharing possibilities be dimensioned to account for the phenomenon of all house phones being used at the same time? Erlang studied the load on telephone circuits, looking at how many lines were required to provide an acceptable service without installing too much costly excess capacity. The formulas he developed allow calculation of the trade-off between cost and service level. The end of this topic provides references for further reading on these formulas.

Obviously, some modifications to the Erlang model are needed for VOIP, because packet-based networks do allow resource sharing. A major benefit of VOIP is that over-provisioning can be reduced, but a major drawback is that the required resources for voice (or other real-time traffic) need to be protected.


Let’s demonstrate an example of an Erlang-style model for VOIP. The three variables involved in these calculations are: Busy Hour Traffic (BHT), Blocking, and Bandwidth.

BHT is the number of hours of call traffic during the busiest hour of operation of a telephone system. This is the Erlang number. Blocking is the failure of calls because an insufficient number of lines is available. A value of 0.01 means 1 call is blocked per 100 calls attempted. Bandwidth is the amount of bandwidth, in kbps, required through an IP/MPLS-based network to carry the traffic.
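The relationship between BHT, blocking, and the number of lines can be sketched with the classic Erlang B formula. The text does not say which Erlang formula it has in mind, so the choice of Erlang B is an assumption here, and the function names and the 10-Erlang load are purely illustrative.

```python
def erlang_b(lines: int, traffic_erlangs: float) -> float:
    """Blocking probability when `traffic_erlangs` of load is offered to `lines` circuits.

    Uses the standard Erlang B recursion, which avoids large factorials.
    """
    blocking = 1.0  # with zero lines, every call attempt is blocked
    for n in range(1, lines + 1):
        blocking = traffic_erlangs * blocking / (n + traffic_erlangs * blocking)
    return blocking


def lines_needed(traffic_erlangs: float, target_blocking: float) -> int:
    """Smallest number of lines whose blocking is at or below the target."""
    lines = 0
    while erlang_b(lines, traffic_erlangs) > target_blocking:
        lines += 1
    return lines


if __name__ == "__main__":
    bht = 10.0  # hypothetical busy-hour traffic, in Erlangs
    print(lines_needed(bht, 0.01))  # lines required for 1 blocked call per 100
```

Multiplying the resulting number of simultaneous calls by the per-call bandwidth gives a first estimate of the bandwidth to protect in the network.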

Here we focus on calculating the bandwidth required only, and let the BHT be calculated by the readers with their own user service model. Several VOIP bandwidth calculators are available for download in the Internet community.

The payload for each call depends heavily on whether the codec is frame-based or sampling-based (obviously, having a mix of codecs in the network is a challenge).

Let us go through the data for one session and use the most bandwidth-hungry codec, G.711, which requires 64 Kbps per call. The math behind 64 Kbps follows the Nyquist theorem, sampling at twice the highest frequency of speech:

4000 Hz x 2 = 8000 samples per second; 8000 samples per second x 8 bits per sample = 64000 bps

The payload conversion into packets can be specified in terms of packet delay or number of frames per packet. The normal packetization delay is 20 ms. Thus, 50 frames (packets) are needed per second. The IP overhead is easy to calculate. The RTP header is 12 bytes, the UDP header is 8 bytes, and the IP header is 20 bytes. Thus, the combined IP, UDP, and RTP overhead is 40 bytes (320 bits) per packet, and the overhead per second is 50 x 320 = 16000 bits. The throughput rate is thus 64000 + 16000 = 80000 bps for a unidirectional SIP session.

One value that is often left out of the calculation is RTCP. These packets are bigger than the RTP packets (as we discussed in the RTP section). Generally, RTCP represents 5% of RTP’s session bandwidth. This is a very liberal value, and in real life 1% is probably closer to the truth. Below is a summary of the raw data for a G.711 SIP unidirectional call:

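Codec payload (G.711)                       64000 bps
Packets per second (20 ms)                  50
IP/UDP/RTP overhead (40 bytes per packet)   16000 bps
RTP session rate                            80000 bps
RTCP (5% of the RTP session rate)           4000 bps
Total                                       84000 bps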
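The per-call arithmetic above can be reproduced with a short sketch. The header sizes, the 20 ms interval, and the 5% RTCP share come from the text; the function and key names are illustrative assumptions.

```python
RTP_HEADER_BYTES = 12
UDP_HEADER_BYTES = 8
IP_HEADER_BYTES = 20

# G.711: 4000 Hz speech band sampled at twice that rate, 8 bits per sample.
G711_BPS = 2 * 4000 * 8


def voip_ip_rate(codec_bps=G711_BPS, packet_interval_ms=20, rtcp_percent=5):
    """Return the IP-level rates (in bps) for one unidirectional call."""
    packets_per_second = 1000 // packet_interval_ms
    header_bits = (RTP_HEADER_BYTES + UDP_HEADER_BYTES + IP_HEADER_BYTES) * 8
    overhead_bps = packets_per_second * header_bits
    rtp_bps = codec_bps + overhead_bps
    rtcp_bps = rtp_bps * rtcp_percent // 100  # RTCP as a share of the RTP session
    return {
        "packets_per_second": packets_per_second,
        "overhead_bps": overhead_bps,
        "rtp_bps": rtp_bps,
        "rtcp_bps": rtcp_bps,
        "total_bps": rtp_bps + rtcp_bps,
    }


if __name__ == "__main__":
    for name, value in voip_ip_rate().items():
        print(f"{name}: {value}")
```

Passing rtcp_percent=1 gives the more realistic figure the text suggests for real-life deployments.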

The media overhead depends on the media being used. For example, for Ethernet with no 802.1q VLAN header, the media overhead is 14 bytes plus a 4-byte CRC. Any VLAN tag adds 4 bytes, any MPLS label adds another 4 bytes, and so forth. Below is an example of the impact of media overhead on rate throughput:

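Media (at 50 packets per second)           Overhead per packet   Overhead per second
Ethernet, no VLAN (14 + 4 bytes)           18 bytes              7200 bps
Ethernet with one 802.1q VLAN tag          22 bytes              8800 bps
Ethernet with VLAN tag and one MPLS label  26 bytes              10400 bps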
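As a rough sketch, the media overhead can be added to the per-call rate in the same way as the IP overhead. The byte counts are the ones quoted in the text; the function name and the particular combinations of tags and labels are illustrative assumptions.

```python
ETHERNET_HEADER_BYTES = 14
ETHERNET_CRC_BYTES = 4
VLAN_TAG_BYTES = 4
MPLS_LABEL_BYTES = 4


def media_overhead_bps(packets_per_second, vlan_tags=0, mpls_labels=0):
    """Bits per second of Layer 2 overhead for a given packet rate."""
    per_packet_bytes = (
        ETHERNET_HEADER_BYTES
        + ETHERNET_CRC_BYTES
        + vlan_tags * VLAN_TAG_BYTES
        + mpls_labels * MPLS_LABEL_BYTES
    )
    return packets_per_second * per_packet_bytes * 8


if __name__ == "__main__":
    rtp_bps = 80000  # G.711 call at the IP level, 50 packets per second
    for vlans, labels in [(0, 0), (1, 0), (1, 1)]:
        overhead = media_overhead_bps(50, vlans, labels)
        print(vlans, labels, overhead, rtp_bps + overhead)
```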

Possible QOS demands beyond delay, jitter, and RTT need to be optimized using rules similar to those used by the legacy PSTN network to estimate busy-hour traffic volume. Some amount of oversubscription exists in most voice services, for example, in the mobile network. A familiar symptom is the ‘please try later’ message from your cell phone or fixed phone when you attempt to make a call.

Many current routers strip the incoming media headers from the outbound queue, resulting in lower overhead, with the exception of MPLS stack label solutions. A Layer 2 VPN solution such as VPLS is an example in which the original media frame and an MPLS stack label are part of the packet. However, on most high-performance routers, only part of the packet (the header and pointers) is in the queue. The whole packet is assembled after scheduling and queuing. The point is that with regard to overhead, the packets’ origin (e.g. MPLS VPN with stack labels) matters more than the outgoing media headers because those can be manipulated during the forwarding process.

What about delay and jitter demands? Jitter and delay are enemies of good quality. Each device has some form of jitter buffer to handle arrival variations. An old recommendation in ITU-T G.114 suggests a maximum end-to-end delay of 150 ms to achieve good quality. This value may be a little too conservative and may no longer be valid, but note that with voice it is the end-to-end delay that is critical. Thus, in a network with several congestion points, any one of them can easily ruin the whole service. Another well-known recommendation is that jitter be less than 30 ms. The designer has some leeway in setting the delay and jitter budget, which is normally driven by the demands of the specific application or equipment, or both. It is clear, however, that a VOIP packet needs VIP treatment with regard to scheduling, because there is a direct competition for resources between the VOIP packets and all other traffic types in the network.
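A simple way to reason about the end-to-end budget is to add up the delay contributed by each hop and compare the sum with the targets. The sketch below uses the 150 ms and 30 ms figures quoted above; the per-hop values and the function name are hypothetical.

```python
MAX_END_TO_END_DELAY_MS = 150  # budget quoted from ITU-T G.114
MAX_JITTER_MS = 30             # jitter should stay below this value


def within_budget(hop_delays_ms, jitter_ms):
    """True if the summed hop delays and the measured jitter fit the budget."""
    total_delay_ms = sum(hop_delays_ms)
    return total_delay_ms <= MAX_END_TO_END_DELAY_MS and jitter_ms < MAX_JITTER_MS


if __name__ == "__main__":
    healthy_path = [20, 15, 30, 25]     # hypothetical per-hop delays in ms
    congested_path = [20, 15, 120, 25]  # one congested hop breaks the budget
    print(within_budget(healthy_path, jitter_ms=10))
    print(within_budget(congested_path, jitter_ms=10))
```

The second path shows why a single congestion point matters: three hops behave well, yet the call as a whole misses the budget.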

Can a voice packet be dropped? Can it have some jitter variation? The answer is: it depends. A service’s control packets cannot be dropped (unless some oversubscription is allowed, as discussed earlier), but some delay and jitter variation is acceptable, because this signaling part of VOIP very rarely affects any interaction with the end user other than the possible frustration of not being able to establish the call. However, this is not necessarily true in the bigger picture of trunking of signaling for voice traffic. For example, relayed SS7 signaling, such as SIGTRAN, is very delay-sensitive and must be delivered on time.

On the other hand, for the data packets, the bearer data cannot be delayed or have jitter, which would affect the quality. It is actually better to drop something than to try to reorder it, because packets can remain in a jitter/playback buffer only for a limited time. If a packet is delayed for too long, it is of no use to the destination, so it is a mistake to spend network resources in processing it.
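The "drop rather than delay" rule can be illustrated with a minimal playout-deadline check. This is a sketch under simple assumptions: each packet carries its send time, the receiver uses a fixed playout delay, and the tuple layout and function name are invented for the example.

```python
def split_by_deadline(packets, playout_delay_ms):
    """Separate packets that arrive before their playout time from late ones.

    Each packet is a (sequence_number, sent_ms, arrived_ms) tuple.
    A packet must be played at sent_ms + playout_delay_ms; if it arrives
    after that instant it is useless to the receiver and is discarded.
    """
    on_time, late = [], []
    for sequence_number, sent_ms, arrived_ms in packets:
        deadline_ms = sent_ms + playout_delay_ms
        if arrived_ms <= deadline_ms:
            on_time.append(sequence_number)
        else:
            late.append(sequence_number)
    return on_time, late


if __name__ == "__main__":
    # Hypothetical 20 ms voice packets; packet 2 is held up in the network.
    received = [(1, 0, 40), (2, 20, 95), (3, 40, 90)]
    print(split_by_deadline(received, playout_delay_ms=60))
```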

The difference between control and data traffic is one of the Catch-22 situations with VOIP. It is interesting to go back to basics and the PSTN and ask, how did they do it? They calculate the service based on an estimate of the "average" load, from which they estimate the resources (time slots for bearer channels and out-of-band connections for the signaling). VOIP is not very different. If you sell the service to X users and want a service level of Y%, you must reserve the bandwidth for it, with a path and delay buffer that have an end-to-end RTT of Z.

IPTV

IPTV and VOD (Video On Demand) both deliver video content to the end user. We treat IPTV broadcasting and VOD scenarios the same here, because both have pretty much the same QOS demands and needs, although they differ somewhat in their service structure. Our focus here is on QOS for IPTV delivery; detailing the many digital TV formats (compressed and non-compressed, MPEG-2 and MPEG-4, SDTV and HDTV, and so forth) is simply beyond the scope of this topic.

To get started, let’s examine MPEG-2 and Transport Stream (TS) delivery. TS, defined by ISO/IEC 13818-1 (ITU-T H.222.0), designates the use of MPEG-2 (and MPEG-4) transport streams for either audio or video, in packetized form. Packetized multimedia streams usually include various information in addition to the raw video and raw audio data, such as ways to identify the type of content in the packet and synchronization information to identify and order received packets. TS normally contains both sound and picture in a single stream. The TS frame (see Figure 4.13) contains multiple MPEG-2 packets, each with a 4-byte header and a payload of 184 bytes. Up to seven MPEG-2 packets fit in one TS frame.

The Packet ID (PID) in the header identifies each MPEG packet, and packets in the same stream have the same PID. The receiver uses the PID to place all received packets into the correct order. The Program Clock Reference (PCR), sent at regular intervals, carries the clock synchronization parameters that help the decoder re-create the encoder’s clock. The PCR also guarantees that the decoder output rate is the same as the encoder’s input signal.

An RTP header can also be added to a TS frame to provide more quality and feedback capabilities on the transmission stream (see Figure 4.14). In summary, the normal packet size for a TS frame is roughly 1362 bytes, including the Ethernet media header and excluding the 802.1q VLAN header. The average rate for delivering an MPEG-2 stream is about 3-4 Mbps. With about 1316 bytes of TS data in each packet, that works out to roughly 300-400 packets per second (pps). The result is fairly large packets on the wire at a moderate packet rate.
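The 1362-byte figure can be reproduced from the sizes given earlier: seven 188-byte MPEG-2 packets plus UDP, IP, and untagged Ethernet overhead. The sketch below assumes that the 3-4 Mbps stream rate refers to the TS data itself, which the text does not state explicitly; the function names are illustrative.

```python
TS_PACKET_BYTES = 4 + 184   # header plus payload of one MPEG-2 packet
TS_PACKETS_PER_FRAME = 7
RTP_HEADER_BYTES = 12
UDP_HEADER_BYTES = 8
IP_HEADER_BYTES = 20
ETHERNET_BYTES = 14 + 4     # Ethernet header plus CRC, no 802.1q tag


def ts_frame_bytes(with_rtp=False):
    """Size on the wire of one TS frame carried over UDP/IP/Ethernet."""
    size = TS_PACKET_BYTES * TS_PACKETS_PER_FRAME
    size += UDP_HEADER_BYTES + IP_HEADER_BYTES + ETHERNET_BYTES
    if with_rtp:
        size += RTP_HEADER_BYTES
    return size


def frames_per_second(stream_bps):
    """Packets per second needed to carry a TS stream of the given rate."""
    ts_bits_per_frame = TS_PACKET_BYTES * TS_PACKETS_PER_FRAME * 8
    return stream_bps / ts_bits_per_frame


if __name__ == "__main__":
    print(ts_frame_bytes(), ts_frame_bytes(with_rtp=True))
    print(round(frames_per_second(3_000_000)), round(frames_per_second(4_000_000)))
```

Under these assumptions a 3-4 Mbps stream needs a few hundred packets per second, each close to the Ethernet MTU.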

Figure 4.13 The TS Frame

Figure 4.14 MPEG with Adaptation Header

One of the most common examples of IPTV services is Streaming TV, which is essentially broadcasting over IP networks. Streaming TV is often combined with voice in a triple-play package that delivers TV, telephony, and data to end-user subscribers, all over an IP network. Many people consider Streaming TV/Video to be synonymous with IP multicast. In most situations, this is the truth, because traffic from a few sources destined to many receivers is delivered perfectly with multicast, which saves bandwidth and resources. Current multicast implementations are dominated by two protocols, IGMP and PIM. Diving deeply into the topic of multicast is beyond the scope of this topic.

The most common setup to deliver IPTV to end users uses a service or multicast VLAN or local replication. So, instead of delivering streaming TV on each subscriber interface, resulting in inefficient bandwidth utilization because of replication of the same stream on many interfaces, the streams are delivered on one dedicated VLAN that reaches all possible users. Figure 4.15 shows an example of such a setup. Each user has a statically or dynamically created interface. All users subscribed to the IPTV service are connected to a common multicast resource VLAN.

QOS Conclusions for IPTV

The quality requirements for IPTV are very similar to those described earlier for VOIP. Both services need to have resources defined for them and neither can be dropped. Among the generation that grew up with analog TV and the challenges of getting the antenna in the right position, there may be a little more tolerance for a limited number of dropped frames.

Figure 4.15 Streaming TV

Whether this is a generic human acceptance or a generation gap is up to the reader. However, there are several fundamental differences between IPTV and VOIP. The most obvious one is packet size. IPTV packets are bigger and are subject to an additional burst, so IPTV requires larger buffers than VOIP. The delay requirement is also different, because the decoder box or node on the egress most often has a playback buffer that can be up to several seconds long, so jitter is generally not significant if a packet arrives within the acceptable time slot.

Re-ordering of packets can be solved, but because each IPTV packet is large, implementing re-ordering is not trivial. The length of the playback buffer is translated into a maximum number of packets that can be stored, a number that is likely not to be large enough if a major re-ordering of packets is necessary. Multicast delivery works best if it is hashed to use the same path rather than spraying packets across multiple parallel links. The result of such spraying is TS frames arriving out of order.

IPTV viewers are most irritated by the delay associated with channel swapping. The speed of IGMP processing of Report and Leave messages can be increased by lowering timer values, configuring immediate-leave functions on routers, and letting the IGMP proxy handle the IGMP Query messages. If a certain amount of double accounting is allowed, that is, allowing the subscriber/STB to have more than one active channel for a specified period (about 1 second), the subscriber/STB can join one channel while leaving another. Most users of IPTV service are on DSL links with a pipe speed of about 8-24 Mbps, so there is limited room for double active streams. However, PIM needs a certain amount of time for its function regardless of whether it is using ASM or SSM. One way to speed up the establishment of a multicast branch is to use static IGMP reports on the router on the LAN to allow certain heavily used groups to always be on.

As always, an important question is which packets to classify as IPTV service packets. This is relatively straightforward. Obviously, the multicast stream is easily identified. Other protocols involved have a TTL of 1 because they are link-local routing protocols such as IGMP and PIM. Well-implemented routers should be able to protect their own routing packets. PIM and IGMP use DSCP 110000 / IP Precedence 110, which are well-known code points for control plane traffic in a packet-based network. Because the protocols are link local, they might transit third-party gear, for example, a Layer 2 switch, but the same principle applies because 802.1p code points can be used. Obviously, if we protect any traffic, the traffic to protect is the control plane traffic.

Other protocols involved in IPTV can be DHCP for STB IP addressing and some HTTP or SSL/TLS traffic for images and such to the STB. But these protocols for the most part do not visibly hurt the end user other than displaying a "Downloading … Please wait" screen message. DHCP can be reissued if certain lease packets are dropped, and TCP-based traffic can be retransmitted, a process handled by the TCP delivery mechanism. The loss of frames seems worse to IPTV viewers than the dropping of words in a VOIP communication because of the simple fact that most of the time humans can ask each other to repeat the last sentence.
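The classification rules just described can be sketched as a small function. This is only an illustration: the class names are invented, and treating every other multicast destination as an IPTV stream is a simplifying assumption. IP protocol numbers 2 (IGMP) and 103 (PIM) are the standard assigned values.

```python
import ipaddress

IGMP_PROTOCOL = 2
PIM_PROTOCOL = 103
CONTROL_DSCP = 0b110000  # DSCP 110000, equivalent to IP Precedence 110


def dscp_to_precedence(dscp):
    """IP Precedence is carried in the three high-order bits of the DSCP."""
    return dscp >> 3


def classify(destination_ip, ip_protocol, dscp):
    """Return a hypothetical class name for a packet seen on the IPTV VLAN."""
    # Control plane first: IGMP and PIM also use multicast destinations.
    if ip_protocol in (IGMP_PROTOCOL, PIM_PROTOCOL) or dscp == CONTROL_DSCP:
        return "control"
    if ipaddress.ip_address(destination_ip).is_multicast:
        return "iptv-stream"
    return "other"


if __name__ == "__main__":
    print(classify("239.1.1.1", 17, 0))          # UDP to a multicast group
    print(classify("224.0.0.1", IGMP_PROTOCOL, CONTROL_DSCP))
    print(classify("192.0.2.10", 6, 0))          # unicast TCP, e.g. HTTP to the STB
```

Checking the control protocols before the multicast test matters, because IGMP and PIM packets are themselves addressed to multicast groups.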

Long-lasting versus Short-lived Sessions

Up until now, this topic has focused on the differences between real-time and non-real-time traffic, as well as on analyzing traffic patterns. But there’s a third element, the duration of a traffic-flow session. From a QOS point of view, the primary requirement that the network must provide is to classify applications based on whether they need QOS.

One result of the Internet’s evolution is the constant change in traffic characteristics, not just the increase in users and distances, but also changes to the applications used. Before the WWW and HTTP arrived on the scene, Internet utilization was mostly UDP- or TCP-based file transfers. Because of the then moderate link speeds, sessions were long-lasting. So in the early stages of the Internet, a network consisted of links that varied widely with regard to speed and MTU, and the protocols and applications had to adapt to this reality. Most UDP-based applications have limited or no support for fragmentation. Instead, they have to pick a packet size small enough to never run into MTU issues when traversing links. Protocols such as DNS and TFTP had their segment size limited to 512 bytes, a value that was assumed to be the MTU supported in most situations.

With the introduction of the WWW and HTTP, the ability to link web locations, thus making information easily reachable, created a revolution in information technology and exchange. "Surfing the web" was more fun than flipping through all the channels on cable TV. But the result was dramatic in terms of network utilization. Network usage expanded from a limited number of file transfers and a few lonely email conversations to millions of sessions. The change in traffic patterns was big, and the earlier dominance of long-lasting sessions gave way to massive numbers of short-lived "click-on-pages" HTTP GET requests and replies. The traffic pattern, however, was a clear push and pull model: end users send request packets to server farms, and the server farms respond back to the end users. This is illustrated in Figure 4.16.

Of course the Internet does not sleep or sit still and its contents kept evolving into things like YouTube, which allows users to upload and share video files.

Several sites offer recorded movies or television shows, resulting in a decrease in prime time TV viewers as they move to video on demand. An example, perhaps less known but just as important to the Nordic web surfer, is the Swedish state-owned public service television, at http://svtplay.se/.

Figure 4.16 Legacy User-Server model with pull of content

Figure 4.17 P2P file-sharing model 

The uploading and downloading of large video files mark the return of long-lasting sessions, but with a new twist: many users now means many sessions because the viewing happens at the same time that the files are downloaded.

But one of the most dramatic changes with regard to traffic patterns in recent years is peer-to-peer (P2P) file-sharing networking, in which end users share content with others. The result is a new traffic pattern, completely different from the earlier user-to-server model or the hub-and-spoke model. Rather, the communication happens directly between end users. An example of such an application is BitTorrent. With these networks, users download to their computer a program that allows them to connect to the virtual sharing network. Then, using this program, they can search the shared media on other users’ computers and download content across the Internet. This creates a kind of give-and-take model, because what you download you also share, as illustrated in Figure 4.17.
