Video Quality Assessment based on Data Hiding. Farias et al. proposed a VQA
algorithm based on data hiding in [27]. Even though the authors claim that the
method is NR, a method that incorporates any information at the source end which is
utilized at the receiver end for QA is, as per our definition, RR VQA. The proposed
algorithm is based on the technique of watermarking. At the source end, an 8 × 8
block Discrete Cosine Transform (DCT) of the frame is performed. A binary mask
is multiplied by an uncorrelated pseudo-random noise matrix, which is rescaled and
added to the medium frequency coefficients from the DCT. A block is selected for
embedding only if the amount of motion (estimated using a block motion estimation
algorithm) exceeds a certain threshold (T_mov). At the receiver end, an inverse process
is performed and the mark is extracted. The measure of degradation of the video is
then the total squared error between the original mask and the retrieved mask. The
authors do not report the statistical measures of performance discussed earlier.
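To make the data-hiding pipeline concrete, the following Python sketch embeds a pseudo-random spreading of binary mask bits into the mid-frequency DCT coefficients of 8 × 8 blocks whose motion exceeds a threshold, and scores degradation at the receiver as the squared error between the embedded and retrieved masks. The embedding strength, motion threshold, mid-frequency positions, correlation-based recovery, and the function names are illustrative assumptions, not the exact design of [27].

import numpy as np
from scipy.fft import dctn, idctn

BLOCK, ALPHA, T_MOV = 8, 2.0, 4.0     # block size, embedding strength, motion threshold (assumed)
MID_BAND = [(1, 3), (2, 2), (3, 1), (1, 4), (4, 1), (2, 3), (3, 2)]   # assumed mid-frequency positions

def block_iter(h, w):
    for y in range(0, h - BLOCK + 1, BLOCK):
        for x in range(0, w - BLOCK + 1, BLOCK):
            yield y, x

def embed(frame, mask_bits, motion, seed=0):
    """Embed +/-1 mask bits, spread by pseudo-random noise, into high-motion blocks.
    mask_bits: 1-D array of +/-1 values derived from the binary mask.
    motion: per-block motion magnitudes (one value per 8x8 block)."""
    rng = np.random.default_rng(seed)
    marked = frame.astype(np.float64).copy()
    i = 0
    for y, x in block_iter(*frame.shape):
        if motion[y // BLOCK, x // BLOCK] < T_MOV:      # embed only where motion exceeds T_mov
            continue
        c = dctn(marked[y:y+BLOCK, x:x+BLOCK], norm='ortho')
        noise = rng.choice([-1.0, 1.0], size=len(MID_BAND))
        for (u, v), n in zip(MID_BAND, noise):
            c[u, v] += ALPHA * mask_bits[i % len(mask_bits)] * n
        marked[y:y+BLOCK, x:x+BLOCK] = idctn(c, norm='ortho')
        i += 1
    return marked

def extract_and_score(received, mask_bits, motion, seed=0):
    """Retrieve the mask by correlating with the same noise; score is the squared error."""
    rng = np.random.default_rng(seed)                   # same pseudo-random sequence as the embedder
    retrieved = []
    for y, x in block_iter(*received.shape):
        if motion[y // BLOCK, x // BLOCK] < T_MOV:
            continue
        c = dctn(received[y:y+BLOCK, x:x+BLOCK].astype(np.float64), norm='ortho')
        noise = rng.choice([-1.0, 1.0], size=len(MID_BAND))
        corr = sum(c[u, v] * n for (u, v), n in zip(MID_BAND, noise))
        retrieved.append(1.0 if corr > 0 else -1.0)
    sent = np.array([mask_bits[k % len(mask_bits)] for k in range(len(retrieved))])
    return float(np.sum((sent - np.array(retrieved)) ** 2))   # degradation measure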
Data hiding in perceptually important areas. Carli et al. proposed a block-based
spread-spectrum method for RR VQA in [28]. The proposed method is similar to
that in [27], and differs only in where the watermark is placed. Regions of
perceptual importance are computed using motion information, contrast and color.
The watermark is embedded only in those areas that are perceptually important,
since degradation in these areas is far more significant. A single video at different
bit-rates is used to demonstrate performance.
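A minimal sketch of how such a per-block importance map might be formed from motion, contrast and color cues is given below. The individual cues, their normalization, the equal weighting and the threshold are assumptions for illustration rather than the exact formulation of [28].

import numpy as np

def block_reduce(img, block=8, fn=np.mean):
    """Reduce an image to one value per block x block tile."""
    h, w = img.shape[0] // block, img.shape[1] // block
    return fn(img[:h*block, :w*block].reshape(h, block, w, block), axis=(1, 3))

def importance_map(prev_y, cur_y, saturation, block=8, threshold=0.5):
    """Boolean per-block map marking blocks deemed perceptually important."""
    motion = block_reduce(np.abs(cur_y.astype(float) - prev_y.astype(float)), block)
    contrast = block_reduce(cur_y.astype(float), block, fn=np.std)   # local luminance spread
    color = block_reduce(saturation.astype(float), block)            # simple colorfulness cue
    cues = [motion, contrast, color]
    cues = [(c - c.min()) / (np.ptp(c) + 1e-9) for c in cues]        # normalize each cue to [0, 1]
    score = sum(cues) / len(cues)                                    # equal (assumed) weighting
    return score > threshold          # the watermark would be embedded only where True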
Limitations of watermarking-based approaches include the fact that squared error
(which is generally computed at the receiver as the measure of quality) does not
relate well to human perception, and that the degradation of a watermark may not be
proportional to the perceptual degradation of the video.
4.2 Other Techniques
Low Bandwidth RR VQA. Using features proposed by the authors in [29], Wolf
and Pinson developed a RR VQA model in [30]. A spatio-temporal (ST) region
consisting of 32 pixels × 32 pixels × 1 second is used to extract three features.
Further, a temporal RR feature, which is essentially a difference between
time-staggered ST regions, is also computed. At the receiver, the same set of
parameters is extracted and then a logarithmic ratio or an error ratio between the
(thresholded) features is computed. Finally, Minkowski pooling is undertaken to
form a quality score for the video. The authors claim that the added RR information
amounts to only about 10 kbit/s.
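The following Python sketch illustrates this reduced-reference comparison: a feature is pooled over 32 × 32 pixel × 1 second ST regions, source and received features are compared with a clamped logarithmic ratio, and the differences are Minkowski-pooled into a single score. The specific feature (spatial gradient energy), the perceptibility threshold and the Minkowski exponent are illustrative assumptions, not the exact parameters of [30].

import numpy as np

ST_X = ST_Y = 32          # spatial extent of an ST region (pixels)
EPS = 1.0                 # perceptibility threshold (assumed)
P = 3.0                   # Minkowski exponent (assumed)

def st_features(frames):
    """One feature per 32 x 32 x T region: RMS of the spatial gradient magnitude."""
    frames = np.asarray(frames, dtype=np.float64)        # shape (T, H, W), roughly 1 second of frames
    gy, gx = np.gradient(frames, axis=(1, 2))
    energy = np.sqrt(gy**2 + gx**2)
    t, h, w = energy.shape
    h, w = (h // ST_Y) * ST_Y, (w // ST_X) * ST_X
    blocks = energy[:, :h, :w].reshape(t, h // ST_Y, ST_Y, w // ST_X, ST_X)
    return np.sqrt(np.mean(blocks**2, axis=(0, 2, 4)))   # one value per ST region

def log_ratio(src, rcv):
    """Clamped logarithmic comparison of source and received features."""
    return np.log10(np.maximum(rcv, EPS) / np.maximum(src, EPS))

def quality_score(src_frames, rcv_frames):
    d = log_ratio(st_features(src_frames), st_features(rcv_frames))
    return float(np.mean(np.abs(d) ** P) ** (1.0 / P))   # Minkowski pooling over ST regions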
Multivariate Data Analysis based VQA. Oelbaum and Diepold utilized a multivariate
data analysis approach, where the HVS is modeled as a black box with some
input features [31, 32]. The output of this box is the visual quality of the video. The
authors utilize previously proposed features for NR IQA, including blur, blocking
and video 'detail'. They also extract noise and predictability using simple techniques.
Edge, motion and color continuity form the rest of the features. Features are