equalization. Each element of the training set, therefore, consists of a map-
ping from the system-controlled parameter values and the objective metrics
of the simulated conditions on a pair of conversations to their subjective pref-
erence. This method ensures that the conversations compared only differ by
one parameter value and that their subjective preference can be attributed to
the system-controlled value that leads to that opinion. We then learn an SVM
classifier using training data based on the results of the subjective tests and
the conditions under which the tests are conducted.
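The structure of one training element described above can be sketched as follows. This is only an illustration under stated assumptions: the names `PairwiseExample`, `make_example`, and `med_ms`, the dictionary layout, and the choice to pass the objective metrics as a single dictionary for the pair are all hypothetical and do not come from the text. The sketch builds the feature vector and label for one compared pair and rejects pairs that differ in more than one system-controlled parameter; fitting the SVM itself is left out.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PairwiseExample:
    """One training element: features of a compared pair plus the preference."""
    features: Tuple[float, ...]
    label: int  # +1 if conversation A was preferred, -1 if B was preferred


def make_example(controlled_a: Dict[str, float],
                 controlled_b: Dict[str, float],
                 objective_metrics: Dict[str, float],
                 a_preferred: bool) -> PairwiseExample:
    """Build one element, enforcing that A and B differ in exactly one parameter."""
    if controlled_a.keys() != controlled_b.keys():
        raise ValueError("both conversations must set the same parameters")
    keys = sorted(controlled_a)
    differing = [k for k in keys if controlled_a[k] != controlled_b[k]]
    if len(differing) != 1:
        raise ValueError(f"expected exactly one differing parameter, got {differing}")
    # Feature order (an assumption): objective metrics, then A's values, then B's.
    features = (
        tuple(objective_metrics[k] for k in sorted(objective_metrics))
        + tuple(controlled_a[k] for k in keys)
        + tuple(controlled_b[k] for k in keys)
    )
    return PairwiseExample(features, 1 if a_preferred else -1)


# Hypothetical test result: listeners preferred conversation B.
example = make_example(
    {"med_ms": 150.0},
    {"med_ms": 250.0},
    {"loss_rate": 0.05, "delay_ms": 80.0, "jitter_ms": 20.0},
    a_preferred=False,
)
```

Rejecting pairs that differ in more than one parameter mirrors the requirement that a preference be attributable to a single system-controlled value.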
At run-time, the parameters representing the current conditions are esti-
mated and input to the SVM. For example, in the design of the POS algorithm
for two-party VoIP, loss, delay, and jitter parameters are used to represent
network conditions, and switching frequency and single-talk duration parameters
represent conversational conditions. The learned SVM outputs the
subjective preference for a given pair of points on the operating curve that
corresponds to the network and conversational conditions observed. Its pre-
dictions on the subjective preference between multiple pairs of points on the
same operating curve are combined using the statistical method described
earlier in order to identify the optimal MED value, which is then used by the
POS algorithm to adjust the jitter-buffer delay in order to achieve the operat-
ing point with the highest subjective quality.
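The run-time step can be sketched roughly as below. Two parts are stand-ins and should be read as assumptions: `toy_predictor` is a hand-written rule used in place of the trained SVM, and the pairwise predictions are combined by simple win counting, which is not the statistical method referred to above. The function name `select_med`, the condition keys, and the candidate values are likewise hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, Sequence

# A predictor returns +1 if the first value is preferred, -1 otherwise.
Predictor = Callable[[Dict[str, float], float, float], int]


def select_med(conditions: Dict[str, float],
               candidates: Sequence[float],
               predict_preference: Predictor) -> float:
    """Pick the candidate that wins the most pairwise comparisons."""
    wins = {med: 0 for med in candidates}
    for med_a, med_b in combinations(candidates, 2):
        if predict_preference(conditions, med_a, med_b) > 0:
            wins[med_a] += 1
        else:
            wins[med_b] += 1
    # Ties go to the smaller delay (an arbitrary choice for this sketch).
    return max(sorted(candidates), key=lambda med: wins[med])


def toy_predictor(conditions: Dict[str, float], med_a: float, med_b: float) -> int:
    """Stand-in for the learned classifier: prefer the value nearer a made-up target."""
    target = conditions["delay_ms"] + 2.0 * conditions["jitter_ms"]
    return 1 if abs(med_a - target) <= abs(med_b - target) else -1


# Estimated network and conversational conditions (illustrative numbers only).
conditions = {
    "loss_rate": 0.02,
    "delay_ms": 100.0,
    "jitter_ms": 40.0,
    "switching_freq": 0.3,
    "single_talk_s": 2.5,
}
best = select_med(conditions, [100.0, 150.0, 200.0, 250.0, 300.0], toy_predictor)
```

With these numbers the toy rule targets 180 ms, so 200.0 wins every comparison and is returned. In a real system the selected value would then be handed to the jitter-buffer control.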
2.3 Cross-Layer Speech Codecs for VoIP
Traditional codecs developed for cellular communications and PSTN calls are
not suitable for VoIP because they have been designed for circuit switching
under low bandwidth, fixed bit rates, and random bit errors. These codecs are
not effective in packet-switched networks, whose loss rates and delay jitters
are dynamic. Some recent codecs have been developed for VoIP applications.
They can encode wide-band speech and exploit trade-offs between bit rate and
delay in order to be more robust against bursty losses. However, they have been
designed without due consideration of LC strategies in other layers of the protocol
stack. Without such consideration, the LC strategies in these codecs can be either
inadequate, giving subpar performance, or redundant.
In this section, we first briefly survey speech codecs designed for VoIP. We
then present the design of cross-layer speech codecs that are designed in
conjunction with LC strategies in the packet-stream layer.
2.3.1 Previous Work on Speech Codecs
Speech codecs were traditionally designed for applications in cellular and
PSTN communications. With the proliferation of IP networks, they have
been increasingly used in VoIP. They can be classified based on their coding
 