Fig. 13 Thumbnails of two video sequences. Left: “Man in Restaurant”, and right: “Street with trees and bicycle”. Original HD sequences © Warner Bros, issued from the Dolby 4-4-4 Film Content Kit One.
As explained in Section 3.1, we used GOPs of 8 consecutive frames as basic units of
video information to extract spatial and temporal descriptors for each clip. The spatial
SMP descriptors were extracted from the first frame of each GOP using five resolution
levels of the Laplacian pyramid as well as the low-frequency residual. The thresholds
were set to keep 1/6 of the patches at each scale, except for the lowest one where all
patches were used. The temporal descriptors were extracted using a diamond-search
block matching algorithm to estimate inter-frame motion vectors on 16 × 16 blocks.
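The motion-estimation step can be illustrated with a minimal diamond-search sketch. The text does not give the search parameters, so the pattern shapes (the usual 9-point large diamond and 5-point small diamond), the sum-of-absolute-differences cost, the step limit, and all function names below are assumptions made for illustration, not the authors' implementation. Frames are plain lists of rows of integer luminance values.

```python
# Hypothetical sketch of diamond-search block matching (not the authors' code).
# Offsets are (row, column) displacements into the reference frame.
LDSP = [(0, 0), (-2, 0), (2, 0), (0, -2), (0, 2),
        (-1, -1), (-1, 1), (1, -1), (1, 1)]          # large diamond pattern
SDSP = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]    # small diamond pattern


def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between the block at (top, left) in `cur`
    and the block displaced by (dy, dx) in `ref`; infinite if out of bounds."""
    height, width = len(ref), len(ref[0])
    if (top + dy < 0 or left + dx < 0
            or top + dy + size > height or left + dx + size > width):
        return float("inf")
    total = 0
    for y in range(size):
        cur_row = cur[top + y]
        ref_row = ref[top + dy + y]
        for x in range(size):
            total += abs(cur_row[left + x] - ref_row[left + dx + x])
    return total


def best_candidate(cur, ref, top, left, center, pattern, size):
    """Return the displacement with the lowest SAD among the pattern points
    placed around `center`; ties go to the center, which is listed first."""
    cy, cx = center
    candidates = [(cy + py, cx + px) for py, px in pattern]
    return min(candidates,
               key=lambda d: sad(cur, ref, top, left, d[0], d[1], size))


def diamond_search(cur, ref, top, left, size=16, max_steps=32):
    """Estimate the motion vector of one block: move the large diamond until
    its best point is its own center, then refine with the small diamond."""
    center = (0, 0)
    for _ in range(max_steps):
        best = best_candidate(cur, ref, top, left, center, LDSP, size)
        if best == center:
            break
        center = best
    return best_candidate(cur, ref, top, left, center, SDSP, size)
```

Calling `diamond_search(cur, ref, top, left)` once per block origin yields one motion vector per block; the returned pair is the (row, column) offset of the best match in the reference frame. Like any greedy search, it can stop at a local minimum of the cost on highly textured content.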
3.3.2 Spatial Dissimilarity
We consider the task of retrieving the GOPs most similar to a query GOP. Hence all
transformed versions of the query GOP itself are expected to be ranked first by the
dissimilarity measure defined above. The dissimilarity measure D between a query
GOP G_Q and a reference GOP G_R as defined in Eq. (21) is a combination of a spatial
term D_s taking into account only spatial features and a temporal term D_t defined over
temporal features. While the spatial descriptors are essentially useful for comparing
statistical scene information of two video pieces, motion descriptors are expected to
highlight similarities based on dynamical patterns like the movement of objects or
persons in a scene. In order to appropriately choose the weighting factors α_1 and α_2
in Eq. (21), we first studied the spatial and temporal parts of the measure separately.
First, we considered only the spatial descriptors (α_1 = 1, α_2 = 0) to retrieve
similar GOPs. The SMP descriptors prove to be crucial for distinguishing GOPs
of the same video sequence as the query from those belonging to different video
sequences. The results obtained are shown in Figure 14. In this figure each curve
shows the dissimilarity between a fixed query GOP and all GOPs from two clips of
the same sequence and one clip of a different sequence in all possible versions. The
query GOP is the first GOP of the first clip of either “Man in Restaurant” or “Street
with trees and bicycle”. A particular reference GOP is identified by the sequence,
clip and version indicated in the middle rectangles of the figure, and by the GOP
label on the x-axis, the 9 GOPs of a particular clip being ordered chronologically.
Even when frame transformations are applied (either rescaling or very lossy
compression), all GOPs originating from the same video sequence as the query remain
far less dissimilar to it than GOPs from a different sequence. These results confirm that SMP descriptors are relevant
for retrieving video scenes that share overall visual similarity with a query scene,
and show in particular that the spatial part of the measure is robust to scaling and
very lossy compression (spatial scalability).
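The spatial-only retrieval experiment amounts to ranking reference GOPs by increasing dissimilarity to the query. The sketch below assumes that Eq. (21) is a weighted sum α_1 D_s + α_2 D_t (the equation is not reproduced in this section). The class and function names are hypothetical, and the per-GOP terms are taken as precomputed inputs.

```python
# Hypothetical sketch of ranking reference GOPs against a query GOP.
# Assumes D = alpha1 * D_s + alpha2 * D_t; the exact form of Eq. (21)
# is not shown in this section.
from typing import List, NamedTuple


class ReferenceGop(NamedTuple):
    label: str          # e.g. "sequence/clip/version/gop"
    d_spatial: float    # spatial term D_s with respect to the query
    d_temporal: float   # temporal term D_t with respect to the query


def combined_dissimilarity(gop: ReferenceGop, alpha1: float, alpha2: float) -> float:
    """Weighted combination of the spatial and temporal terms."""
    return alpha1 * gop.d_spatial + alpha2 * gop.d_temporal


def rank_gops(refs: List[ReferenceGop],
              alpha1: float = 1.0, alpha2: float = 0.0) -> List[ReferenceGop]:
    """Sort reference GOPs from most to least similar to the query.
    The defaults (alpha1=1, alpha2=0) reproduce the spatial-only setting."""
    return sorted(refs, key=lambda g: combined_dissimilarity(g, alpha1, alpha2))
```

With the default weights the ranking depends on D_s alone, which matches the setting used for Figure 14. Passing `alpha1=0.0, alpha2=1.0` gives the analogous temporal-only ranking.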