Figure 2.8
JND helps reduce the number of subjective evaluations.
(Labels recovered from the figure: operating points A, B, C, D, E; MED;
"Clear preference: A>C"; "Clear preference: A>E"; "Barely noticeable";
"(A vs. B)"; "(A vs. D)"; "Comparison against A needed"; "Comparison
against A not needed".)
we have carried out simulated conversations and have conducted pair-wise
subjective tests to capture the relative user preferences.
It is difficult to find the best operating point on an operating curve by
comparing MOS scores because there are infinitely many such scores to be
evaluated. Moreover, some operating points do not have to be assessed to the
same confidence level when they are obviously inferior or unnecessary. To this
end, we have developed a statistical method that minimizes the total number of
tests for finding an operating point with the best subjective conversational
quality to within a prescribed confidence level [20]. This is possible because
statistical tests for comparing two conversations by human subjects follow a
multinomial distribution.
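As a rough illustration of how a pairwise test of this kind can be scored, the
sketch below treats each subject's vote as one of three outcomes (prefers A,
prefers B, or no preference) and applies an exact one-sided sign test to the
non-tie votes. This is a standard stand-in, not the method of [20]; the
function names, the vote labels, and the 0.05 significance level are all
assumptions made for the example.

```python
from math import comb


def sign_test_p_value(prefer_x: int, prefer_y: int) -> float:
    """Exact one-sided p-value for H0: X is not preferred over Y.

    Ties are dropped beforehand: conditioning a three-outcome (multinomial)
    vote on the non-tie outcomes leaves a binomial with p = 0.5 under H0.
    """
    n = prefer_x + prefer_y
    if n == 0:
        return 1.0  # no informative votes, so no evidence either way
    # P(at least prefer_x successes out of n fair coin flips)
    tail = sum(comb(n, k) for k in range(prefer_x, n + 1))
    return tail / 2 ** n


def compare(votes, alpha=0.05):
    """Return 'A', 'B', or 'undecided' for a list of 'A'/'B'/'tie' votes.

    alpha is a hypothetical significance level applied to each direction.
    """
    a = votes.count("A")
    b = votes.count("B")
    if sign_test_p_value(a, b) < alpha:
        return "A"
    if sign_test_p_value(b, a) < alpha:
        return "B"
    return "undecided"
```

For instance, five votes for A, none for B, and any number of ties give a
p-value of 1/32, which is enough to declare A preferred at the assumed level,
whereas an even split stays undecided and would call for more tests.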
For comparing two points on an operating curve, we have developed axioms on
reflexivity, independence, identical statistical distribution, symmetry,
indistinguishability, incomparability, and subjective preference. Using these
axioms, we have constructed a general model for comparing two points on an
operating curve that allows us to determine the likely direction in which the
local optimum of subjective preference lies. However, the nonparametric
nature of the model makes it difficult to combine the result of a test with
the prior information already obtained. Hence, we have also developed a
parametric model of subjective comparisons by simplifying the general model.
The simple model allows a probabilistic representation of our knowledge of the
location of the local optimum and a way to statistically combine the deductions
from multiple comparisons [20]. It also allows us to develop an adaptive search
algorithm that significantly reduces the number of comparisons needed for
identifying the local optimum. In addition, an estimate of the confidence in
the result provides a consistent stopping condition for our algorithm.
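To make the idea of combining comparisons probabilistically more concrete, the
following is a minimal sketch in the style of probabilistic bisection. It is
not the algorithm of [20]. It assumes that the operating curve has been
discretized into n_points candidates, that preference is unimodal along the
curve, and that each comparison of neighbouring points m and m+1 indicates the
correct direction of the optimum with a fixed probability p_correct. All names
and default values are hypothetical.

```python
def adaptive_search(n_points, prefer_right, p_correct=0.75,
                    confidence=0.9, max_tests=200):
    """Locate the preferred point on a discretized curve (n_points >= 2).

    prefer_right(m) runs one comparison of point m against point m + 1 and
    returns True if m + 1 is preferred (optimum assumed to lie at index > m).
    Returns (best index, belief in that index, number of comparisons used).
    """
    # Uniform prior belief over the location of the optimum.
    belief = [1.0 / n_points] * n_points
    tests = 0
    while max(belief) < confidence and tests < max_tests:
        # Cumulative belief up to and including each candidate boundary m.
        cum, running = [], 0.0
        for b in belief[:-1]:
            running += b
            cum.append(running)
        # Compare across the boundary that splits the belief most evenly.
        m = min(range(n_points - 1), key=lambda i: abs(cum[i] - 0.5))
        right = prefer_right(m)
        tests += 1
        # Bayes update: points on the indicated side gain weight.
        for k in range(n_points):
            agrees = (k > m) == right
            belief[k] *= p_correct if agrees else 1.0 - p_correct
        total = sum(belief)
        belief = [b / total for b in belief]
    best = max(range(n_points), key=belief.__getitem__)
    return best, belief[best], tests
```

A call such as adaptive_search(16, lambda m: 11 > m) simulates a noise-free
subject whose preferred point is index 11. The belief in the returned index
plays the role of the confidence estimate, and the loop stops once it reaches
the prescribed level or the test budget runs out.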
Our results show that sequential evaluations of a single operating curve
are the most effective in terms of minimizing the number of tests performed
 