into multiple motion layers. Each layer records the motion vectors with a
specified accuracy. The lowest layer gives only a coarse representation
of the motion vectors. The higher layers are used to refine the accuracy.
Different layers are coded independently so that the motion information can
be truncated at the layer boundary.
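The layering idea can be illustrated with a minimal sketch. It assumes one motion-vector component stored as an integer in quarter-pel units and three layers (integer-pel, half-pel, quarter-pel); the function names and the floor-based rounding rule are hypothetical choices made for illustration, not the scheme of any cited work.

```python
def split_motion_layers(mv_qpel, num_layers=3):
    """Split one motion-vector component (quarter-pel units) into layers.

    Layer 0 is the coarsest; each further layer halves the step size.
    The floor-based rounding is an assumption made for this sketch.
    """
    layers = []
    approx = 0
    for k in range(num_layers):
        step = 1 << (num_layers - 1 - k)    # 4, 2, 1 for three layers
        target = (mv_qpel // step) * step   # value on the current grid
        layers.append(target - approx)      # increment added by this layer
        approx = target
    return layers


def reconstruct(layers, keep):
    """Rebuild the vector component from the first `keep` layers only."""
    return sum(layers[:keep])


layers = split_motion_layers(13)    # 13 quarter-pels = 3.25 pixels
coarse = reconstruct(layers, 1)     # truncated at the first layer boundary
exact = reconstruct(layers, 3)      # all layers kept
```

Dropping the refinement layers leaves a valid but coarser vector, which is what allows truncation at a layer boundary.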
Because the truncated motion information no longer matches the
residual data, schemes with scalable motion information suffer from drifting
errors. To mitigate this, a linear model is proposed in [26] to provide a better
trade-off between the scalable representation and the rate-distortion performance.
3.3.4 The (2D+t+2D) Structure
The 2D+t structure is suggested to reduce the drawbacks of the t+2D structure,
such as inaccurate motion vectors at lower spatial resolutions and drifting
errors. Fig. 3.2 (c) shows a typical 2D+t+2D scheme; the 2D+t structure is
the special case in which the second spatial decomposition is omitted. The first
2D spatial transform is usually a multi-level dyadic wavelet transform and is
called the pre-spatial decomposition or transform. In this structure, MCTF
is applied to each spatial sub-band generated by the pre-spatial transform.
One drawback is that motion estimation on the low-resolution images is
less accurate than motion estimation at high resolution, because only
low-resolution references are available.
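The ordering of the 2D+t structure (spatial transform first, temporal filtering second) can be sketched as follows. This is a simplified illustration under stated assumptions: a one-level Haar transform stands in for the multi-level dyadic wavelet, the temporal filter is a plain Haar pair with no motion compensation, and the function names and the LH/HL naming convention are hypothetical.

```python
def haar2d(frame):
    """One-level 2D Haar split of an even-sized frame into four sub-bands."""
    bands = {name: [] for name in ("LL", "LH", "HL", "HH")}
    for r in range(0, len(frame), 2):
        out = {name: [] for name in bands}
        for c in range(0, len(frame[0]), 2):
            a, b = frame[r][c], frame[r][c + 1]
            d, e = frame[r + 1][c], frame[r + 1][c + 1]
            out["LL"].append((a + b + d + e) / 4)  # local average
            out["LH"].append((a - b + d - e) / 4)  # horizontal detail
            out["HL"].append((a + b - d - e) / 4)  # vertical detail
            out["HH"].append((a - b - d + e) / 4)  # diagonal detail
        for name in bands:
            bands[name].append(out[name])
    return bands


def temporal_haar(band_a, band_b):
    """Temporal low/high pair for one sub-band (motion compensation omitted)."""
    low = [[(x + y) / 2 for x, y in zip(ra, rb)] for ra, rb in zip(band_a, band_b)]
    high = [[(x - y) / 2 for x, y in zip(ra, rb)] for ra, rb in zip(band_a, band_b)]
    return low, high


def two_d_plus_t(frame_a, frame_b):
    """Pre-spatial transform on each frame, then temporal filtering per sub-band."""
    sa, sb = haar2d(frame_a), haar2d(frame_b)
    return {name: temporal_haar(sa[name], sb[name]) for name in sa}


result = two_d_plus_t([[1, 2], [3, 4]], [[3, 4], [5, 6]])
```

In a real codec the temporal step would be motion compensated within each sub-band, which is where the low-resolution reference problem described above arises.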
Based on the above observations, a second spatial wavelet decomposition,
called the post-spatial decomposition (or transform), is added as shown in Fig.
3.2 (c) [22, 23]. The VidWav evaluation software is able to produce various
combinations of pre- and post-spatial decompositions. Three examples are
shown in Fig. 3.18. The first example, Fig. 3.18 (a), is a conventional t+2D
scheme with three temporal levels. The resulting four temporal sub-bands
are t-LLL, t-LLH, t-LH, and t-H. A 2-level or 3-level post-spatial
decomposition is then applied depending on the temporal sub-band. Since the t-H
sub-band contains high-pass signals, it is reasonable to apply fewer
decomposition levels to it. Fig. 3.18 (b) shows a two-level pre-spatial
decomposition. Only the lowest spatial band is further split by the
post-spatial transforms. Fig. 3.18 (c)
demonstrates a possible combination based on a similar concept when the
pre-spatial transform has only two levels. This structure provides considerable
flexibility in trading temporal against spatial coding efficiency. For example,
it enables inter-layer prediction between the lower and higher spatial
resolutions, an idea also adopted by the AVC-based SVC. More sophisticated
schemes in the 2D+t+2D category, such as the Spatial-Temporal tool (STool),
have also been proposed as additions to the VidWav software. In STool, inter-layer
prediction is applied to the lower spatial band after MCTF [24, 25].
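To see how three temporal levels yield the four sub-bands t-LLL, t-LLH, t-LH, and t-H named above, consider the following minimal sketch. It assumes a group of eight frames, each reduced to a single sample value, and a plain Haar filter without motion compensation; the function name is hypothetical and the code does not reproduce the VidWav software.

```python
def temporal_subbands(frames, levels=3):
    """Dyadic temporal Haar decomposition (no motion compensation).

    `frames` holds one sample per frame; its length is assumed to be
    a multiple of 2**levels.
    """
    bands = {}
    low = list(frames)
    prefix = ""
    for _ in range(levels):
        pairs = list(zip(low[0::2], low[1::2]))
        # The high-pass output of this level is final.
        bands["t-" + prefix + "H"] = [(a - b) / 2 for a, b in pairs]
        # The low-pass output is decomposed again at the next level.
        low = [(a + b) / 2 for a, b in pairs]
        prefix += "L"
    bands["t-" + prefix] = low
    return bands


bands = temporal_subbands([1, 2, 3, 4, 5, 6, 7, 8])
sizes = {name: len(values) for name, values in bands.items()}
```

For eight input frames the sub-band sizes are 4, 2, 1, and 1 frames for t-H, t-LH, t-LLH, and t-LLL respectively, so each sub-band can be given its own post-spatial decomposition depth.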
3.3.5 Enhanced Motion Compensated Filtering
Another problem in the conventional t+2D structure is that it produces image
artifacts in low-pass temporally filtered images due to incorrect motion vectors.