matured over the years, we shall not dwell on it here. The interested reader is
referred to [4, 5] for tutorial chapters on FR VQA.
Thinking purely from an engineering perspective, one realizes that there
exists another modality for VQA. Instead of feeding the algorithm the reference
and distorted videos, what if we fed it the distorted video and some features from
the reference video? Can we extract features from the reference video and embed
them into the video that we are (say) transmitting? If so, at the receiver end we can
extract these reference features and use them for VQA. Such assessment of quality
is referred to as reduced-reference (RR) VQA. RR and NR techniques for VQA
form the core of this chapter.
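To make the RR idea concrete, here is a minimal sketch in Python, assuming grayscale frames stored as NumPy arrays. The features chosen (per-frame luminance mean and standard deviation) and the distance-to-quality mapping are illustrative placeholders, not any published RR method; practical RR algorithms use perceptually motivated features.

```python
import numpy as np

def rr_features(frames):
    """Illustrative reduced-reference features: per-frame luminance
    mean and standard deviation. Real RR methods use perceptually
    motivated features (e.g., natural scene statistics)."""
    return np.array([[f.mean(), f.std()] for f in frames])

def rr_quality(ref_features, dist_frames):
    """Receiver side: recompute the same features on the distorted
    video and score quality as a decaying function of the feature
    distance (1 = identical features, tending to 0 as they diverge)."""
    dist_features = rr_features(dist_frames)
    d = np.linalg.norm(ref_features - dist_features, axis=1).mean()
    return 1.0 / (1.0 + d)  # map distance to a (0, 1] quality score

# Usage: only ref_features (a few numbers per frame) travels with
# the bitstream; the full reference video is never needed.
# score = rr_quality(rr_features(reference_frames), received_frames)
```

The point of the sketch is that only a handful of numbers per frame need to accompany the transmitted video, rather than the full reference.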
In describing the RR technique, we have inadvertently stumbled upon the general
system description for which most algorithms described in this chapter are designed.
There exists a pristine reference video which is transmitted through a system from
the source. At the receiver, a distorted version of this video is received whose qual-
ity is to be assessed. Now, the system through which the video passes could be a
compression algorithm. In this case, as we shall see, measures of blockiness and
blurriness are used for NR VQA. If the system is a channel that drops packets,
the effect of packet loss on quality may be evaluated. These concepts and many
others are discussed in this chapter. Before we describe recent algorithms, let us
briefly digress into how the performance of an algorithm is evaluated.
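As an illustration of the blockiness idea mentioned above, here is a toy no-reference measure for a single grayscale frame, assuming 8×8 coding blocks. It simply compares luminance jumps across block boundaries with jumps inside blocks; it is a sketch of the principle, not any published NR metric.

```python
import numpy as np

def blockiness(frame, block=8):
    """Crude no-reference blockiness score for one grayscale frame.
    Compares luminance jumps across block boundaries with jumps
    inside blocks; larger ratios suggest more visible blocking.
    Assumes the frame width is a multiple of the block size."""
    diff = np.abs(np.diff(frame.astype(float), axis=1))  # horizontal gradients
    cols = np.arange(diff.shape[1])
    at_boundary = (cols % block) == (block - 1)  # gradients straddling block edges
    boundary_jump = diff[:, at_boundary].mean()
    interior_jump = diff[:, ~at_boundary].mean()
    return boundary_jump / (interior_jump + 1e-8)
```

Blurriness measures follow a similar spirit, typically examining how spread out the edges in a frame are.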
2 Performance Evaluation of Algorithms and Databases
At this stage we have some understanding of what a VQA algorithm does. We know
that the aim of VQA is to create algorithms that predict the quality of a video such
that the algorithmic prediction matches that of a human observer. For this section
let us assume that we have an algorithm which takes as input a distorted video (and
some reference features) and gives us as output a number. The range of the output
could be anything, but for this discussion, let us assume that this range is 0 to 1,
where a value of 0 indicates that the video is extremely bad and a value of 1 indicates
that the video is extremely good. We also assume that the scale is continuous, i.e.,
all real numbers between 0 and 1 are valid algorithmic scores. With this setup, the
next question one should ask is, 'How do we know whether the generated numbers
are any good?' Essentially, what is the guarantee that the algorithm is not spewing
out random numbers between 0 and 1 with no regard for the viewer's perception?
The ultimate observer of a video is a human, and hence human perception of quality
is of utmost importance. A set of videos is therefore used in a subjective study,
and the perceptual quality of each video is captured in its mean opinion score
(MOS). However, picking (say) 10 videos and demonstrating that the algorithmic
scores correlate with human subjective perception is not sufficient. We require that
the algorithm perform well over a wide variety of cases, and hence the database on
which the algorithm is tested must contain a broad range of distortions and a variety
of content, so that the stability of its performance may be assessed. To allow for a
fair comparison of algorithms
developed by different groups, it is imperative that the VQA database, along with
the associated subjective scores, be made publicly available.
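To make this evaluation concrete, here is a minimal sketch, assuming SciPy is available, of the standard correlation-based check: the Spearman rank-order correlation (SROCC) and the Pearson linear correlation (PLCC) between algorithmic scores and MOS. The data below is synthetic, purely for illustration.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def evaluate_vqa(algorithm_scores, mos):
    """Correlate algorithmic scores with subjective MOS over a database.
    SROCC measures monotonic agreement; PLCC measures linear agreement
    (in practice often computed after fitting a nonlinear mapping,
    which is omitted here)."""
    srocc, _ = spearmanr(algorithm_scores, mos)
    plcc, _ = pearsonr(algorithm_scores, mos)
    return srocc, plcc

# Sanity check with synthetic data: an algorithm emitting random
# numbers should show near-zero correlation with the MOS.
rng = np.random.default_rng(0)
mos = rng.uniform(1, 5, size=50)            # hypothetical subjective scores
random_scores = rng.uniform(0, 1, size=50)  # the "random numbers" case
print(evaluate_vqa(random_scores, mos))     # both correlations near 0
```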