matured over the years, we shall not dwell on it here. The interested reader is
referred to [4, 5] for tutorial chapters on FR VQA.
Thinking purely from an engineering perspective, one realizes that there
exists another modality for VQA. Instead of feeding the algorithm the reference
and distorted videos, what if we fed it the distorted video and some features from
the reference video? Can we extract features from the reference video and embed
them into the video that we are (say) transmitting? If so, at the receiver end we can
extract these reference features and use them for VQA. Such assessment of quality
is referred to as reduced-reference (RR) VQA. RR and NR techniques for VQA
form the core of this chapter.
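To make the RR idea concrete, here is a minimal sketch in Python, assuming grayscale frames stored as NumPy arrays. The features chosen (per-frame luminance mean and standard deviation) and the distance-to-quality mapping are illustrative placeholders, not any published RR method; practical RR algorithms use perceptually motivated features.

```python
import numpy as np

def rr_features(frames):
    """Illustrative reduced-reference features: per-frame luminance
    mean and standard deviation. Real RR methods use perceptually
    motivated features (e.g., natural scene statistics)."""
    return np.array([[f.mean(), f.std()] for f in frames])

def rr_quality(ref_features, dist_frames):
    """Receiver side: recompute the same features on the distorted
    video and score quality as a decaying function of the feature
    distance (1 = identical features, tending to 0 as they diverge)."""
    dist_features = rr_features(dist_frames)
    d = np.linalg.norm(ref_features - dist_features, axis=1).mean()
    return 1.0 / (1.0 + d)  # map distance to a (0, 1] quality score

# Usage: only ref_features (a few numbers per frame) travels with
# the bitstream; the full reference video is never needed.
# score = rr_quality(rr_features(reference_frames), received_frames)
```

The point of the sketch is that only a handful of numbers per frame need to accompany the transmitted video, rather than the full reference.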
In describing the RR technique, we have inadvertently stumbled upon the general
system description for which most algorithms described in this chapter are designed.
There exists a pristine reference video which is transmitted through a system from
the source. At the receiver, a distorted version of this video is received whose qual-
ity is to be assessed. Now, the system through which the video passes could be a
compression algorithm. In this case, as we shall see, measures of blockiness and
blurriness are used for NR VQA. If the system is a channel that drops packets,
the effect of packet loss on quality may be evaluated. These concepts and many
others are discussed in this chapter. Before we describe recent algorithms, let us
briefly digress into how the performance of an algorithm is evaluated.
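As an illustration of the blockiness idea mentioned above, here is a toy no-reference measure for a single grayscale frame, assuming 8×8 coding blocks. It simply compares luminance jumps across block boundaries with jumps inside blocks; it is a sketch of the principle, not any published NR metric.

```python
import numpy as np

def blockiness(frame, block=8):
    """Crude no-reference blockiness score for one grayscale frame.
    Compares luminance jumps across block boundaries with jumps
    inside blocks; larger ratios suggest more visible blocking.
    Assumes the frame width is a multiple of the block size."""
    diff = np.abs(np.diff(frame.astype(float), axis=1))  # horizontal gradients
    cols = np.arange(diff.shape[1])
    at_boundary = (cols % block) == (block - 1)  # gradients straddling block edges
    boundary_jump = diff[:, at_boundary].mean()
    interior_jump = diff[:, ~at_boundary].mean()
    return boundary_jump / (interior_jump + 1e-8)
```

Blurriness measures follow a similar spirit, typically examining how spread out the edges in a frame are.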
2 Performance Evaluation of Algorithms and Databases
At this stage we have some understanding of what a VQA algorithm does. We know
that the aim of VQA is to create algorithms that predict the quality of a video such
that the algorithmic prediction matches that of a human observer. For this section
let us assume that we have an algorithm which takes as input a distorted video (and
some reference features) and gives us as output a number. The range of the output
could be anything, but for this discussion, let us assume that this range is 0 to 1,
where a value of 0 indicates that the video is extremely bad and a value of 1 indicates
that the video is extremely good. We also assume that the scale is continuous, i.e.,
all real numbers between 0 and 1 are valid algorithmic scores. With this setup, the
next question one should ask is, 'How do we know whether the generated numbers
are any good?' Essentially, what is the guarantee that the algorithm is not spewing
out random numbers between 0 and 1 with no regard for the viewer's perception?
The ultimate observer of a video is a human, and hence human perception of quality
is of utmost importance. A set of videos is therefore used in a subjective study,
and the perceptual quality of each video is captured in its mean opinion score
(MOS). However, picking (say) 10 videos and demonstrating that the algorithmic
scores correlate with human subjective perception is not sufficient. We require that
the algorithm perform well over a wide variety of cases, and hence the database on
which the algorithm is tested must contain a broad range of distortions and a variety
of content, so that the stability of its performance may be assessed. To allow for a
fair comparison of algorithms
developed by different groups, it is imperative that the VQA database, along with
the associated subjective scores, be made publicly available.
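To make this evaluation concrete, here is a minimal sketch, assuming SciPy is available, of the standard correlation-based check: the Spearman rank-order correlation (SROCC) and the Pearson linear correlation (PLCC) between algorithmic scores and MOS. The data below is synthetic, purely for illustration.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def evaluate_vqa(algorithm_scores, mos):
    """Correlate algorithmic scores with subjective MOS over a database.
    SROCC measures monotonic agreement; PLCC measures linear agreement
    (in practice often computed after fitting a nonlinear mapping,
    which is omitted here)."""
    srocc, _ = spearmanr(algorithm_scores, mos)
    plcc, _ = pearsonr(algorithm_scores, mos)
    return srocc, plcc

# Sanity check with synthetic data: an algorithm emitting random
# numbers should show near-zero correlation with the MOS.
rng = np.random.default_rng(0)
mos = rng.uniform(1, 5, size=50)            # hypothetical subjective scores
random_scores = rng.uniform(0, 1, size=50)  # the "random numbers" case
print(evaluate_vqa(random_scores, mos))     # both correlations near 0
```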