Retransmission of speech frames after the detection of a network loss is
infeasible in real-time VoIP because of the excessive delays involved and their
effect on MED.
Nonredundant LC schemes are generally based on the interleaving of
frames during packetization [59]. One way is to exploit the fact that shorter
distortions are less likely to be perceived, and to break an otherwise long seg-
ment into several shorter segments that are close by but not consecutive. This
is not strictly an LC technique because it does not actually recover losses.
Another way is MDC [47,52,53], which generates multiple descriptions with
correlated information from the original speech data. This may be hard in
low bit-rate streams whose correlated information has been largely removed
during coding [47]. Another disadvantage is that the receiver incurs a
longer MED because it must wait for all the descriptions to arrive before
declaring that a description is lost.
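The interleaving idea described above can be illustrated with a minimal sketch. It assumes a simple round-robin layout in which packet k carries frames k, k + n, k + 2n, and so on; the function names and the parameter num_packets are hypothetical and are not taken from [59].

```python
def interleave(frames, num_packets):
    """Spread frames over packets so that no packet holds consecutive frames."""
    # Packet k carries frames k, k + num_packets, k + 2 * num_packets, ...
    return [frames[k::num_packets] for k in range(num_packets)]


def deinterleave(packets, num_packets, total):
    """Rebuild the frame sequence; frames of a lost packet (None) stay None."""
    out = [None] * total
    for k, packet in enumerate(packets):
        if packet is None:
            continue  # nothing is recovered, the gap is only broken up
        for j, frame in enumerate(packet):
            out[k + j * num_packets] = frame
    return out


frames = list(range(12))                 # stand-ins for 12 speech frames
packets = interleave(frames, num_packets=4)
packets[1] = None                        # simulate the loss of one packet
received = deinterleave(packets, num_packets=4, total=len(frames))
```

In this sketch the lost packet removes frames 1, 5, and 9, so the listener hears three isolated one-frame gaps rather than one three-frame gap. No frame is recovered, which matches the remark that interleaving is not strictly an LC technique.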
Redundant LC schemes exploit trade-offs among the redundancy level, the
delay for recovering losses from the redundant information, and the quality
of the reconstructed speech. They work on the Internet because increases in
packet size, as long as the packet stays smaller than the MTU [45], do not lead
to noticeable increases in the loss rate [36]. They are divided into schemes that
use partial redundancy and schemes that use full redundancy. Examples employing partial redundancies include
layered coding [31-33], unequal error protection (UEP) [37], and redundant
MDC [38]. Examples employing full redundancies include FEC (forward
error correction) [9,34] and redundant piggybacking [35,36]. An FEC-based
LC scheme [15] for VoIP incorporates into its optimization metric the addi-
tional delay incurred due to redundancy. In our previous work, we have
used piggybacking as a simple yet effective technique for sending copies of
previously sent frames together with new frames in the same packet, with-
out increasing the packet rate [4,10,36].
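A minimal sketch of piggybacking follows, under the assumption that a redundancy level r means each packet carries its new frame plus copies of the r - 1 frames sent just before it. The function names and the (sequence number, frame) packet layout are hypothetical and only illustrate the idea; they are not the format used in [4,10,36].

```python
def packetize(frames, redundancy):
    """Packet i carries frame i plus copies of the previous redundancy - 1 frames."""
    packets = []
    for i in range(len(frames)):
        start = max(0, i - (redundancy - 1))
        # One packet per frame, so the packet rate does not change.
        packets.append([(seq, frames[seq]) for seq in range(start, i + 1)])
    return packets


def receive(packets, lost, total):
    """Rebuild the frame sequence from the packets whose index is not in lost."""
    out = [None] * total
    for i, packet in enumerate(packets):
        if i in lost:
            continue
        for seq, frame in packet:
            out[seq] = frame  # a later copy fills in a frame lost earlier
    return out


frames = ["f0", "f1", "f2", "f3", "f4", "f5"]
packets = packetize(frames, redundancy=2)
single_loss = receive(packets, lost={2}, total=len(frames))
burst_loss = receive(packets, lost={2, 3}, total=len(frames))
```

With a redundancy level of 2, the single loss is fully repaired because packet 3 carries a copy of f2, but that copy arrives one packet period late, which is the recovery delay in the trade-off noted earlier. A burst of two consecutive losses still leaves f2 missing, so longer bursts call for a higher redundancy level.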
The main difficulty of using redundant LC schemes is that it is hard to
determine a suitable redundancy level. Dynamic adaptation of this level to network
conditions may be either too slow, as in Skype [36], or too conservative [4]. Another
consideration is that the redundancy level is application-dependent. Fully
redundant piggybacking is suitable in two-party VoIP, but partial redun-
dancy may need to be used in multi-party VoIP when speech frames from
multiple clients are encapsulated in the same packet.
Figure 2.9b also summarizes the various POS methods. Because delays and losses
are nonstationary and path-dependent, simple schemes with fixed MEDs, set
either at design time or during call establishment, do not
provide consistent protection against late losses. Adaptive POS schemes
that adjust the playout schedule at the talk-spurt or the packet level are
more prevalent.
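The weakness of a fixed MED can be seen in a small sketch. It makes a simplifying assumption that the whole MED budget is available for network delay, so a packet counts as a late loss when its delay exceeds the fixed MED. The delay traces and the 100 ms setting are made-up illustrative values, not measurements.

```python
def late_loss_rate(delays_ms, med_ms):
    """Fraction of packets that arrive after their playout deadline."""
    late = sum(1 for d in delays_ms if d > med_ms)
    return late / len(delays_ms)


# Hypothetical one-way delays (ms) on the same path at two different times.
calm = [40, 45, 50, 42, 48]
congested = [90, 130, 110, 160, 95]

fixed_med = 100  # chosen once, at design time or at call setup

rate_calm = late_loss_rate(calm, fixed_med)
rate_congested = late_loss_rate(congested, fixed_med)
```

In this example the fixed setting is needlessly long while the path is calm and loses three of five packets once delays rise, which is why a schedule that follows the observed delays is preferred.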
At the talk-spurt level, silence segments can be added or omitted at the
beginning of a talk spurt in order to make the changes virtually imperceptible
to the listener. Adjustments can also be made for each frame using time-scale
modification [40], which stretches or compresses frames without changing their