model. A variable block size and the use of other motion models are also possible,
and are the subject of ongoing research.
4.2 Motion Estimation and Adaptive Temporal Penalization
As mentioned, motion estimation is based on the well-known Lucas and Kanade
method [23, 24], applied in a block-by-block manner as follows. Assume we
computed initial estimates of the local spatial and temporal derivatives. For example,
spatial derivatives may be computed using classic kernel regression or existing
derivative filtering techniques. Temporal derivatives are computed by taking the
temporal difference between pixels of the current frame and one of the neighboring
frames. Let z_x1, z_x2 and z_t denote vectors containing (in lexicographical order)
derivative estimates from the pixels in a local analysis window w_l associated with
the l-th block in the frame. This window contains and is typically centered on the
block of pixels of interest, but may include additional pixels beyond the block (i.e.
analysis windows from neighboring blocks may overlap). A motion vector m_l for
block l is estimated by solving the optical flow equation [z_x1, z_x2] m + z_t = 0 in
the least-squares sense. The basic Lucas and Kanade method is applied iteratively
for improved performance. As explained before, MASK uses multiple frames in a
temporal window around the current frame. For every block in the current frame, a
motion vector is computed to each of the neighboring frames in the temporal window.
Hence, if the temporal window contains 4 neighboring frames in addition to
the current frame, we compute 4 motion vectors for each block in the current frame.
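The block-wise least-squares solve described above can be sketched as follows. This is a minimal single-pass illustration under simplifying assumptions: central-difference derivative filters stand in for kernel regression, the analysis window coincides with the block, and the iterative refinement is omitted; the function name and window parameters are illustrative, not from the chapter.

```python
import numpy as np

def lucas_kanade_block(frame, next_frame, top, left, win):
    """Estimate one block's motion vector m = (m_x, m_y) by solving
    [z_x1, z_x2] m + z_t = 0 in the least-squares sense (Lucas-Kanade)."""
    # Spatial derivatives (central differences; any derivative filter works here).
    zx1 = np.gradient(frame, axis=1)   # horizontal derivative
    zx2 = np.gradient(frame, axis=0)   # vertical derivative
    # Temporal derivative: difference between the current and a neighboring frame.
    zt = next_frame - frame

    # Local analysis window w_l around the block of interest.
    sl = np.s_[top:top + win, left:left + win]
    A = np.stack([zx1[sl].ravel(), zx2[sl].ravel()], axis=1)
    b = -zt[sl].ravel()
    # Least-squares solution of the stacked optical flow equations.
    m, *_ = np.linalg.lstsq(A, b, rcond=None)
    return m
```

In the full method this solve would be repeated iteratively, warping toward the neighboring frame, and run once per neighboring frame in the temporal window to obtain the multiple motion vectors per block.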
In practice, a wide variety of transitions/activities will occur in natural video. Some
of them are so complex that no parametric motion model matches them exactly, and
motion errors are unavoidable. When there are errors in the estimated motion vectors,
visually unacceptable artifacts may be introduced in the reconstructed frames
due to the motion-based processing. One way to avoid such visible artifacts in up-
scaled frames is to adapt the temporal weighting based on the correlation between
the current block and the corresponding blocks in other frames determined by the
motion vectors. That is to say, before constructing MASK weights, we compute
the reliability (η) of each estimated motion vector. A simple way to define η is
to use the mean square error or mean absolute error between the block of interest
and the corresponding block in the neighboring frame towards which the motion
vector is pointing. Once the reliability of the estimated motion vector is available,
we penalize the steering kernels by a temporal kernel K_t, a kernel function of η.
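As a concrete sketch of this penalization, the reliability η can be taken as the mean square error between the block of interest and its motion-compensated match, then mapped to a weight by an exponential kernel. The exponential form and the smoothing parameter h are assumptions for illustration; the text only requires K_t to be some kernel function of η:

```python
import numpy as np

def temporal_penalty(block, matched_block, h=10.0):
    """Map motion reliability eta to a temporal weight K_t(eta).

    eta : mean square error between the current block and the block in the
          neighboring frame that the estimated motion vector points to.
    K_t : an assumed exponential kernel; a reliable match (small eta) keeps
          a weight near 1, while a poor match is penalized toward 0.
    """
    diff = np.asarray(block, float) - np.asarray(matched_block, float)
    eta = np.mean(diff ** 2)        # reliability measure (MSE)
    return np.exp(-eta / h)         # temporal kernel K_t(eta)
```

A weight near 1 leaves the steering kernels of that neighboring frame essentially untouched, whereas an unreliable motion vector suppresses that frame's contribution to the reconstruction.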
Fig. 7 illustrates the temporal weighting, incorporating motion reliability. Suppose
we upscale the l-th block in the frame at time t using 2 previous and 2 forward
frames, and there are 4 motion vectors, m_{l,i}, between a block in the frame at t and
the 4 neighboring frames. First, we find the blocks that the motion vectors indicate
in the neighboring frames, shown as y_{l,i} in Fig. 7. Then, we compute the motion
reliability based on the difference between the l-th block at t and these blocks, and
decide the temporal penalization for each neighboring block.
 