1 Introduction
Imagine this situation: you are given two videos with the same content, but
one is a 'low quality' (distorted) version of the other, and you are asked
to rate the distorted version against the original (reference) video on a scale of
(say) 1-5, where 1 is bad and 5 is excellent. Let us further assume that we collect
a representative subset of the human populace and ask them the same question, and
instead of just asking them to rate one pair of videos, we ask them to rate a whole
set of such pairs. We then have a set of ratings for each of
the distorted videos which, when averaged across subjects, gives a number between
1 and 5. This number is the mean opinion score (MOS) of that video and is a
measure of the perceptual quality of the video. The setting just described is called
subjective evaluation of video quality and the case in which the subject is shown
both the reference and the distorted video is referred to as a double stimulus study.
One could imagine many possible variations to this technique. For example, instead
of showing each video once, let us show each video twice so that in the first pass the
human 'decides' and in the second pass the human 'rates'. This is a perfectly valid
method of collecting subjective scores and along with a plethora of other techniques
forms one of the possible methods for subjective evaluation of video quality. Each
of these methods is described in a document from the International Telecommunications
Union (ITU) [1]. If we always had the time to assemble a representative subset of the
human populace and rate every video whose quality we wished to evaluate, there would
be no need for this chapter, or for the decades of research that have gone into
creating algorithms for this very purpose.
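The averaging step described above is simple enough to sketch in a few lines of code. The ratings below are hypothetical, purely for illustration: each row is one subject, each column one distorted video, and values lie on the 1-5 scale just described.

```python
# Hypothetical ratings from a subjective study: rows are subjects,
# columns are distorted videos, values are on the 1-5 opinion scale.
ratings = [
    [4, 2, 5, 1],  # subject 1
    [5, 3, 4, 2],  # subject 2
    [4, 2, 4, 1],  # subject 3
]

def mean_opinion_scores(ratings):
    """Average each video's ratings across all subjects to get its MOS."""
    n_subjects = len(ratings)
    n_videos = len(ratings[0])
    return [
        sum(subject[v] for subject in ratings) / n_subjects
        for v in range(n_videos)
    ]

mos = mean_opinion_scores(ratings)
print(mos)  # one MOS per distorted video
```

In practice, standardized studies also screen out unreliable subjects and may rescale raw scores before averaging, but the core of the MOS is exactly this per-video mean.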
Algorithmic prediction of video quality is referred to as objective quality
assessment, and as one can imagine it is far more practical than a subjective study.
Algorithmic video quality assessment (VQA) is the focus of this chapter. Before we
delve directly into the subject matter, let us explore objective assessment just as we
did with the subjective case. Imagine you have an algorithm to predict the quality of a
video. At this point it is simply a 'black box' that outputs a number between (say)
1 and 5, which in the majority of cases correlates with what a human would say. What
would you imagine the inputs to this system are? Analogous to the double stimulus
setup we described before, one could say that both the reference and distorted videos
are fed as inputs to the system: this is full reference (FR) quality assessment. If one
were to imagine practical applications of FR VQA, one would soon realize that having
a reference video is infeasible in many situations. The next logical step is then
truncating the number of inputs to our algorithm and feeding in only the distorted
video: this is no reference (NR) VQA. Does this mean that FR VQA is not an interesting
area for research? Surprisingly enough, the answer to this question is NO!
There are many reasons for this, and one of the primary ones is that FR VQA is an
extremely difficult problem to solve. This is largely because our understanding of the
perceptual mechanisms that form an integral part of the human visual system (HVS)
is still at a nascent stage [2, 3]. FR VQA is also interesting for another reason: it
gives us techniques and tools that may be extended to NR VQA. Since FR VQA has