This approach has several shortcomings for evaluating VoIP. First, when
subjects complete a task and evaluate the quality of a conversation
simultaneously, the cognitive attention required by one may interfere with
the other. Second, the type and complexity of the task affect the perceived
quality: tasks requiring faster turn-taking can be more adversely affected by
transmission delays than others. Third, subjective evaluations have no
reference, and ACR scores depend heavily on the expertise of the subjects.
Lastly, the results are hard to repeat, even for the same subjects and the
same task.
In the Nippon Telegraph and Telephone Corp. (NTT) study [7] discussed
earlier, subjective conversational experiments were conducted between two
parties using a voice system with adjustable delays. Because the study did not
consider the effects of losses and variations in delay, its results are not
directly applicable to VoIP systems.
ITU-T Study Group [10] has recognized the lack of methods for evaluating
conversational speech quality in networks and is currently conducting a study.
However, it is not clear if the study will lead to an objective or a subjective
methodology and whether the results can help design better VoIP systems.
2.2.2 Evaluations and Generalization of Conversational Quality
Testbed for Evaluating VoIP Systems. We have developed a testbed for
emulating two-party [4] and multi-party [10] VoIP. This entails the collection
of Internet packet traces and multi-party interactive conversations and the
design of a system to replay these traces and conversations. The prototype
allows subjective tests to be repeated for different VoIP systems under identi-
cal network and conversational conditions [18].
The prototype consists of multiple computers, each running the VoIP cli-
ent software, and a Linux router for emulating the real-time network traffic
[4,10]. We have modified the kernel of the router to intercept all UDP
packets carrying encoded speech between any two clients. The router
runs a troll program that drops or delays intercepted packets in each direction
according to packet traces collected on PlanetLab. We have also developed
a human-response-simulator (HRS) that runs on each end-client. The HRSs
simulate a conversation with prerecorded speech segments by taking turns
speaking their respective segments. We use a software interface to digitally
transfer the waveforms to and from the clients without quality loss.
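
As a rough illustration of the two mechanisms just described, the following
Python sketch replays a packet trace against a stream of packets and lays out
a turn-taking timeline for two simulated speakers. It is not the testbed's
code: the trace format (a loss flag and a one-way delay per packet), the
wrap-around when the trace is shorter than the packet stream, the fixed
response delay, and all names (TraceEntry, replay_trace, simulate_turns) are
assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TraceEntry:
    """One record of a hypothetical packet trace: loss flag and one-way delay."""
    lost: bool
    delay_ms: float


def replay_trace(send_times_ms: List[float],
                 trace: List[TraceEntry]) -> List[Optional[float]]:
    """Return the arrival time of each packet, or None if the trace drops it.

    Packet i is treated according to trace entry i. Wrapping around when the
    trace is shorter than the packet stream is an assumption of this sketch.
    """
    arrivals: List[Optional[float]] = []
    for i, sent_at in enumerate(send_times_ms):
        entry = trace[i % len(trace)]
        arrivals.append(None if entry.lost else sent_at + entry.delay_ms)
    return arrivals


def simulate_turns(segment_ms: List[float],
                   one_way_delay_ms: float,
                   response_delay_ms: float) -> List[Tuple[str, float, float]]:
    """Alternate two simulated speakers, A and B, over prerecorded segments.

    A speaker starts its segment response_delay_ms after hearing the end of
    the other side's segment, which arrives one_way_delay_ms after it was
    spoken. Returns (speaker, start_ms, end_ms) tuples on a global clock.
    """
    timeline: List[Tuple[str, float, float]] = []
    start = 0.0
    for i, duration in enumerate(segment_ms):
        speaker = "A" if i % 2 == 0 else "B"
        end = start + duration
        timeline.append((speaker, start, end))
        # The next speaker hears the end of this segment after the network
        # delay, then waits the response delay before talking.
        start = end + one_way_delay_ms + response_delay_ms
    return timeline


if __name__ == "__main__":
    trace = [TraceEntry(False, 80.0), TraceEntry(True, 0.0), TraceEntry(False, 120.0)]
    print(replay_trace([0.0, 20.0, 40.0, 60.0], trace))
    print(simulate_turns([1000.0, 1500.0, 800.0], 100.0, 200.0))
```

Because both functions are deterministic, running them again with the same
trace and the same segments gives the same schedule. That is the property
that lets different VoIP clients be compared under identical conditions.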
Subjective Evaluations of Four VoIP Systems. We have compared four
two-party VoIP clients: Skype (3.6), Google-Talk (beta), Windows Live
Messenger (8.1), and Yahoo Messenger (8.1) [18]. Using conversations recorded
by our testbed under given network and conversational conditions, human
subjects were asked to compare pairs of conversations on the CCR
scale. The tests were conducted using six Internet traces under different
network conditions and an additional trace representing an ideal condition
with no loss and no delay. We used three distinct conversations with different
single-talk durations, HRD, and switching frequencies. The subjective test
results in