1 Introduction
Imagine this situation: you are given two videos with the same content, but
one is a 'low quality' (distorted) version of the other, and you are asked
to rate the distorted version against the original (reference) video on a scale of
(say) 1-5, where 1 is bad and 5 is excellent. Let us further assume that we collect
a representative subset of the human populace and ask them the same question, and
instead of just asking them to rate one pair of videos, we ask them to rate a whole
set of such pairs. We then have a set of ratings for each of
the distorted videos which, when averaged across subjects, gives a number between
1 and 5. This number is the mean opinion score (MOS) of that video and is a
measure of the perceptual quality of the video. The setting just described is called
subjective evaluation of video quality and the case in which the subject is shown
both the reference and the distorted video is referred to as a double stimulus study.
One could imagine many possible variations to this technique. For example, instead
of showing each video once, let us show each video twice so that in the first pass the
human 'decides' and in the second pass the human 'rates'. This is a perfectly valid
method of collecting subjective scores and along with a plethora of other techniques
forms one of the possible methods for subjective evaluation of video quality. Each
of these methods is described in a document from the International Telecommunications
Union (ITU) [1]. If we always had the time to assemble a representative subset of the
human populace and rate every video whose quality we wished to evaluate, there would
be no need for this chapter, or for the decades of research that have gone into
creating algorithms for this very purpose.
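The averaging step described above is simple enough to sketch in a few lines of code. The ratings below are hypothetical, purely for illustration: each row is one subject, each column one distorted video, and values lie on the 1-5 scale just described.

```python
# Hypothetical ratings from a subjective study: rows are subjects,
# columns are distorted videos, values are on the 1-5 opinion scale.
ratings = [
    [4, 2, 5, 1],  # subject 1
    [5, 3, 4, 2],  # subject 2
    [4, 2, 4, 1],  # subject 3
]

def mean_opinion_scores(ratings):
    """Average each video's ratings across all subjects to get its MOS."""
    n_subjects = len(ratings)
    n_videos = len(ratings[0])
    return [
        sum(subject[v] for subject in ratings) / n_subjects
        for v in range(n_videos)
    ]

mos = mean_opinion_scores(ratings)
print(mos)  # one MOS per distorted video
```

In practice, standardized studies also screen out unreliable subjects and may rescale raw scores before averaging, but the core of the MOS is exactly this per-video mean.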
Algorithmic prediction of video quality is referred to as objective quality
assessment, and as one can imagine it is far more practical than a subjective study.
Algorithmic video quality assessment (VQA) is the focus of this chapter. Before we
delve directly into the subject matter, let us explore objective assessment just as we
did with the subjective case. Imagine you have an algorithm to predict the quality of a
video. At this point it is simply a 'black box' that outputs a number between (say)
1 and 5, which in the majority of cases correlates with what a human would say. What
would you imagine the inputs to this system are? Analogous to the double stimulus
setup we described before, one could say that both the reference and distorted videos
are fed as inputs to the system: this is full reference (FR) quality assessment. If one
were to imagine practical applications of FR VQA, one would soon realize that having
a reference video is infeasible in many situations. The next logical step is then
truncating the number of inputs to our algorithm and feeding in only the distorted
video: this is no reference (NR) VQA. Does this mean that FR VQA is not an interesting
area for research? Surprisingly enough, the answer to this question is NO!
There are many reasons for this, and one of the primary ones is that FR VQA is an
extremely difficult problem to solve. This is largely because our understanding of the
perceptual mechanisms that form an integral part of the human visual system (HVS)
is still at a nascent stage [2, 3]. FR VQA is also interesting for another reason: it
gives us techniques and tools that may be extended to NR VQA. Since FR VQA has