2.5 Network-Layer Support for VoIP
Several standardized and proposed transport protocols, such as TCP, UDP,
and RTP, were designed with different trade-offs between reliable delivery of
packets and application-layer end-to-end delay. By providing hooks for syn-
chronization, reliability, QoS feedback, and flow control, RTP can support
real-time multimedia applications [61-63]. RTP by itself does not guarantee
the real-time delivery of continuous media; instead, it relies on resource
reservation protocols, such as RSVP [64], for resource scheduling. RTP can be used in
conjunction with the mechanisms described in this chapter to ensure high
conversational quality. Another approach is to design end-to-end protocols
that are more TCP friendly [65,66]. These protocols will need to be extended
in order to address the trade-offs in conversational quality.
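The synchronization hooks mentioned above live in the fixed 12-byte RTP header: a sequence number for detecting loss and reordering, and a media timestamp for playout scheduling. The following is a minimal sketch of that header layout as defined in RFC 3550, assuming no padding, no header extension, and no CSRC entries. The helper names pack_rtp_header and unpack_rtp_header are illustrative only and do not come from any particular RTP library.

```python
import struct

# Fixed RTP header layout (RFC 3550), network byte order:
#   byte 0: version (2 bits), padding (1), extension (1), CSRC count (4)
#   byte 1: marker (1 bit), payload type (7 bits)
#   bytes 2-3: sequence number, bytes 4-7: timestamp, bytes 8-11: SSRC
RTP_HEADER_FORMAT = "!BBHII"
RTP_HEADER_SIZE = struct.calcsize(RTP_HEADER_FORMAT)  # 12 bytes


def pack_rtp_header(seq, timestamp, ssrc, payload_type=0, marker=False):
    """Build a fixed RTP header (hypothetical helper, no CSRC list)."""
    byte0 = 2 << 6  # version 2; padding, extension, and CSRC count all zero
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(RTP_HEADER_FORMAT, byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data):
    """Parse the fixed RTP header back into its fields."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack(
        RTP_HEADER_FORMAT, data[:RTP_HEADER_SIZE])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

A receiver can use gaps in seq to detect lost packets and differences in timestamp to schedule playout. The header carries no delivery guarantee of its own, which is consistent with RTP relying on other mechanisms for resource scheduling.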
In multi-party VoIP, each current speaker needs to convey his or her speech
information to all the listeners. A good design should accommodate dynamic
and diverse network conditions among the clients. The protocol and the con-
nection topology used are generally dictated by where audio mixing is done
[67]. When a VoIP client (as in Skype [68]) or a bridge (as in QQTalk [69]) is
responsible for decoding, mixing, and encoding the signals from the clients,
it is natural for all the clients to send their packets to the centralized site to
be mixed and forwarded. This approach may not be scalable because it can
create a bottleneck near that site. Moreover, the maximum end-to-end delay
(ME2ED) between any speaker-listener pair can be large when the clients are
geographically distributed [10].
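To illustrate why ME2ED can grow under centralized mixing, the sketch below computes it for a small set of clients. It assumes a simplified model in which the delay from a speaker to a listener is the one-way delay from the speaker to the mixing site plus the one-way delay from that site to the listener, ignoring mixing, coding, and jitter-buffer delays. The function name, client names, and delay values are hypothetical.

```python
from itertools import permutations


def me2ed_centralized(one_way_delay_ms, mixer):
    """Maximum speaker-to-listener delay when all packets pass through `mixer`.

    one_way_delay_ms[a][b] is the assumed one-way network delay from a to b.
    The mixer may itself be one of the clients, in which case its own leg
    to or from the mixing site costs nothing.
    """
    clients = list(one_way_delay_ms)

    def leg(a, b):
        return 0 if a == b else one_way_delay_ms[a][b]

    return max(leg(speaker, mixer) + leg(mixer, listener)
               for speaker, listener in permutations(clients, 2))


# Hypothetical one-way delays in milliseconds; C is far from A and B.
delays = {
    "A": {"B": 40, "C": 120},
    "B": {"A": 40, "C": 150},
    "C": {"A": 120, "B": 150},
}
```

With these illustrative numbers, mixing at A gives an ME2ED of 160 ms, whereas mixing at the distant client C gives 270 ms, well above the largest direct delay of 150 ms between any pair.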
On the other hand, a distributed approach asks each client to indepen-
dently manage its transmissions. One way is for each speaker to multicast
the speech information to all listeners. Although multicast is available on
the Internet [70], support for reliable real-time multicast to receivers with
different loss and delay behavior is still very preliminary [71]. The IETF
working group's focus on a NACK-based asynchronous-layered coding protocol
with FEC [72] is inadequate for multi-party VoIP.
A hybrid approach is to have an overlay network [73] that uses a subset
of the clients to manage the mixing and forwarding of unicast packets. The
approach achieves a shorter ME2ED than a centralized approach and a
smaller number of unicast messages than a fully distributed approach. One
issue, however, is that it is complex for the overlay clients to coordinate the
decoding, mixing, and encoding of the speech signals. Alternatively, we have
chosen to have the overlay clients simply encapsulate the speech
frames from multiple clients into a single packet before forwarding
them [10]. This works as long as the encapsulated frames fit
within the MTU of each packet.
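The encapsulation idea can be sketched as follows. This is not the packet format used in [10]; it is a hypothetical layout in which each frame is prefixed by a one-byte client identifier and a one-byte length. The defaults assume a 1500-byte MTU (typical for Ethernet) and 28 bytes of IPv4 and UDP headers without IP options; both values, like the function names, are assumptions made for illustration.

```python
import struct


def encapsulate(frames, mtu=1500, header_overhead=28):
    """Pack (client_id, frame_bytes) pairs into one payload without decoding.

    Raises ValueError if the result would not fit in a single packet.
    """
    budget = mtu - header_overhead
    parts = []
    for client_id, frame in frames:
        if not 0 <= client_id <= 255 or len(frame) > 255:
            raise ValueError("client id and frame length must fit in one byte")
        # Hypothetical per-frame prefix: client id, then frame length.
        parts.append(struct.pack("!BB", client_id, len(frame)) + frame)
    payload = b"".join(parts)
    if len(payload) > budget:
        raise ValueError("encapsulated frames exceed the MTU budget")
    return payload


def decapsulate(payload):
    """Recover the (client_id, frame_bytes) pairs from an encapsulated payload."""
    frames = []
    offset = 0
    while offset < len(payload):
        client_id, length = struct.unpack_from("!BB", payload, offset)
        offset += 2
        frames.append((client_id, payload[offset:offset + length]))
        offset += length
    return frames
```

Because the overlay client only concatenates already-encoded frames, it avoids the decode, mix, and re-encode steps. The cost is that the payload grows linearly with the number of streams, which is why the MTU check matters.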
To design a topology with proper trade-offs between ME2ED among the
clients and the maximum number of packets relayed in a single packet
period by any client, we have studied a commonly used overlay topology
 