The Design of VoIP Systems with High Perceptual Conversational Quality - Ubiquitous Multimedia Computing - page 50


Figure 2.6
Distribution of pair-wise subjective scores of four VoIP systems. (Plot title: "Pair-wise subjective comparisons of VoIP systems". Horizontal axis: average score in CCR rating, from -3 to 3. Vertical axis: 0 to 1. Curves: Skype vs Google, Skype vs Yahoo, Skype vs Windows, Google vs Yahoo, Google vs Windows, Yahoo vs Windows.)

Figure 2.6 illustrates that Windows Live is preferred over the others. This preference is consistent with the objective metrics in terms of PESQ, CS, and CE (not shown). Similar tests have also been conducted to compare the multi-party version of Skype with our proposed system [10].
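As a rough illustration of how pair-wise scores on the CCR scale can be summarized, the sketch below computes the fraction of votes in each category and the mean score for one pair of systems. It assumes the seven-point comparison scale of ITU-T P.800 (-3 to 3). The function name `ccr_distribution` and the example votes are hypothetical and are not data from the study.

```python
from collections import Counter

# Assumed scale: ITU-T P.800 comparison category rating,
# from -3 (much worse) to 3 (much better).
CCR_SCALE = range(-3, 4)

def ccr_distribution(votes):
    """Return (fraction of votes per CCR category, mean score)."""
    if not votes:
        raise ValueError("need at least one vote")
    if any(v not in CCR_SCALE for v in votes):
        raise ValueError("CCR votes must be integers in [-3, 3]")
    counts = Counter(votes)
    total = len(votes)
    dist = {score: counts.get(score, 0) / total for score in CCR_SCALE}
    mean = sum(votes) / total
    return dist, mean

# Hypothetical votes for "system A vs system B"; positive means A was preferred.
example_votes = [1, 2, 0, 1, -1, 2, 3, 1, 0, 1]
dist, mean = ccr_distribution(example_votes)
```

A positive mean indicates that the first system of the pair was preferred on average. The per-category fractions give the kind of distribution that a plot such as Figure 2.6 summarizes.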

Statistical Off-Line Subjective Tests. We have studied the statistical scheduling of off-line subjective tests for evaluating alternative control schemes in real-time multimedia applications. These applications are characterized by multiple counteracting objective quality metrics (such as delay and signal quality) that can be affected by various control schemes. However, the trade-offs among these metrics with respect to the subjective preferences of users are not defined. As a result, it is difficult to select the control parameter values that lead to the best subjective quality at run time. Since subjective tests are expensive to conduct and the number of possible control values and run-time conditions is prohibitively large, it is important that a minimum number of such tests be conducted off-line, and that the results learned can be generalized to unseen conditions with statistical confidence. To this end, we have developed efficient algorithms for scheduling a sequence of subjective tests under given conditions. Our goal is to minimize the number of subjective tests needed to determine the best operating point of the multimedia system to within some prescribed level of statistical confidence. A secondary goal is to efficiently schedule subjective tests under a multitude of operating conditions. The approach relies on the fact that humans can differentiate two such conversations when they differ by more than the just-noticeable difference (JND), also known as the difference limen [19]. Here, the JND is the difference in physical sensory input that is detected 50 percent of the time.
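To make the idea of stopping once a prescribed statistical confidence is reached more concrete, the sketch below runs pair-wise comparisons of two control settings until the observed preference rate can be distinguished from 50 percent. It is a generic stopping rule based on the Wilson score interval, not the scheduling algorithm described above. The names `wilson_interval`, `compare_until_confident`, and the `next_vote` callback are hypothetical. Checking the interval after every test also inflates the error rate relative to a single fixed-size test, so the nominal confidence level should be read as approximate.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n == 0:
        return 0.0, 1.0
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

def compare_until_confident(next_vote, max_tests=200, z=1.96):
    """Run pair-wise tests until the preference for A differs from 50%.

    next_vote is a hypothetical callback that returns True when a subject
    prefers setting A over setting B. Returns (decision, tests_used).
    """
    wins = 0
    for n in range(1, max_tests + 1):
        wins += bool(next_vote())
        lo, hi = wilson_interval(wins, n, z)
        if lo > 0.5:      # A preferred with the requested confidence
            return "A", n
        if hi < 0.5:      # B preferred with the requested confidence
            return "B", n
    return "undecided", max_tests  # difference may be below what subjects detect
```

For example, `compare_until_confident(lambda: True)` stops after a handful of unanimous votes, whereas evenly split votes exhaust `max_tests` and return `"undecided"`, which is the expected outcome when two settings differ by less than the JND.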
