Spatiotemporal Video Upscaling Using Motion-Assisted Steering Kernel (MASK) Regression - High-Quality Visual Experience

the third frame (spatial upscaling). When estimating the pixel value at $\mathbf{x} = [x_1, x_2, t]$, where $t = t_3$, we first compute 2-D steering kernel weights for each frame, as illustrated in Fig. 5(a-i), using the first Gaussian kernel function in (28). Motion is not taken into account at this stage. Second, given the motion vectors $\mathbf{m}_i$, which we estimate using the optical flow technique with the translational motion model and the frame at $t_{i=3}$ as the anchor frame, we shift the steering kernels for each frame by $\mathbf{m}_i$, as illustrated in Fig. 5(a-ii). Finally, as in Fig. 5(a-iii), the temporal kernel function penalizes the shifted steering kernels so that neighboring frames closer in time receive higher weights.
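The three steps above can be sketched for a single sample as follows. This is a minimal illustration rather than the chapter's implementation: the Gaussian steering kernel parameterized by a 2x2 matrix `C`, the Gaussian temporal kernel, the sign convention of the shift, and all function and parameter names (`steering_weight`, `temporal_weight`, `mask_weight`, `h`, `h_t`) are assumptions, since (28) is not reproduced here.

```python
import math

def steering_weight(dx, C, h):
    """Step 1: 2-D Gaussian steering kernel weight for a spatial offset dx.

    C is a symmetric 2x2 matrix ((c11, c12), (c12, c22)) describing the local
    orientation; h is a global smoothing parameter (both assumed forms).
    """
    (c11, c12), (_, c22) = C
    det = c11 * c22 - c12 * c12
    quad = c11 * dx[0] ** 2 + 2.0 * c12 * dx[0] * dx[1] + c22 * dx[1] ** 2
    return math.sqrt(det) / (2.0 * math.pi * h * h) * math.exp(-quad / (2.0 * h * h))

def temporal_weight(dt, h_t):
    """Step 3: Gaussian penalty on the temporal distance dt (assumed form)."""
    return math.exp(-(dt * dt) / (2.0 * h_t * h_t))

def mask_weight(x, sample, C, m, h, h_t):
    """Weight of one sample for the position of interest x = (x1, x2, t).

    sample = (xi1, xi2, ti); m = (m1, m2) is the motion vector of the
    sample's frame relative to the anchor frame.
    """
    # Step 2: shift the steering kernel by the motion vector of that frame.
    dx = (sample[0] - x[0] - m[0], sample[1] - x[1] - m[1])
    return steering_weight(dx, C, h) * temporal_weight(sample[2] - x[2], h_t)

identity = ((1.0, 0.0), (0.0, 1.0))
x = (0.0, 0.0, 3.0)
# The same object point observed one and two frames after the anchor frame.
w_frame4 = mask_weight(x, (2.0, 1.0, 4.0), identity, (2.0, 1.0), 1.0, 1.0)
w_frame5 = mask_weight(x, (4.0, 2.0, 5.0), identity, (4.0, 2.0), 1.0, 1.0)
```

With motion compensation both samples sit at the center of their shifted kernels, so the difference between `w_frame4` and `w_frame5` comes only from the temporal penalty.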

Local steering parameters and spatio-temporal weights are estimated at each pixel location $\mathbf{x}_i$ in a small region of support for the final regression step. Once the MASK weights are available, we plug them into (11), as in the 2-D case, compute the equivalent kernel $W_N$, and then estimate the missing pixels and denoise the given samples from the local input samples ($y_i$) around the position of interest $\mathbf{x}$. Similar to (12), the final spatio-temporal regression step can be expressed as follows:

$$\hat{z}(\mathbf{x}) = \sum_{i=1}^{P} W_i(\mathbf{x}; \mathbf{H}_i^{s}, \mathbf{H}_i^{m}, h_t, K, N)\, y_i. \qquad (29)$$
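The regression step in (29) is a weighted sum of the local samples. As a minimal sketch, the zeroth-order case ($N = 0$) can be written in a few lines, because the equivalent kernel then reduces to the raw kernel weights normalized to sum to one. Higher orders require solving the weighted least-squares problem in (11), which is not shown here. The function names are hypothetical.

```python
def equivalent_weights_order0(raw_weights):
    """Equivalent kernel for regression order N = 0: normalized raw weights."""
    total = sum(raw_weights)
    if total == 0:
        raise ValueError("all kernel weights are zero; enlarge the support")
    return [w / total for w in raw_weights]

def regress(raw_weights, samples):
    """Estimate z(x) as the weighted sum of the local samples y_i."""
    W = equivalent_weights_order0(raw_weights)
    return sum(W_i * y_i for W_i, y_i in zip(W, samples))

# Three neighboring samples; the third has twice the weight of the others.
estimate = regress([1.0, 1.0, 2.0], [10.0, 20.0, 30.0])
```

Because the weights are normalized, the estimate always lies within the range of the input samples in this zeroth-order case.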

The MASK approach is also capable of upscaling video temporally (also called frame interpolation or frame rate upconversion). Fig. 5(b) illustrates the MASK weights for estimating an intermediate frame at some time $t$ between $t_3$ and $t_4$. We generate the MASK weights by following essentially the same procedure as described in Figs. 5(a-i)-(a-iii). However, the motion vectors are now needed with the unknown intermediate frame as the anchor frame. We assume that the motion between the frames at $t_3$ and $t_4$ is constant and, using the motion vectors $\mathbf{m}_i$, $i = 1, \dots, 5$, we linearly interpolate the motion vectors $\tilde{\mathbf{m}}_i$ as

$$\tilde{\mathbf{m}}_i = \mathbf{m}_i + \mathbf{m}_4 (t - t_3). \qquad (30)$$
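As a small worked sketch of (30), the function below applies the interpolation to every vector in the temporal window. The function name, the tuple representation of motion vectors, and the assumption that time is measured in frame intervals (so that $t_4 - t_3 = 1$) are illustrative choices, not taken from the text.

```python
def interpolate_motion(m, m4, t, t3):
    """Apply (30) to each vector: m~_i = m_i + m_4 * (t - t_3).

    m  : list of (m1, m2) motion vectors estimated with frame 3 as anchor
    m4 : motion vector of frame 4, assumed constant between t_3 and t_4
    t  : time of the intermediate frame, t_3 <= t <= t_4 (frame units assumed)
    """
    s = t - t3
    return [(mi[0] + m4[0] * s, mi[1] + m4[1] * s) for mi in m]

# Five-frame window with constant motion of (2, 1) pixels per frame.
m = [(-4, -2), (-2, -1), (0, 0), (2, 1), (4, 2)]
m_tilde = interpolate_motion(m, m[3], t=3.5, t3=3)
```

At $t = t_3$ the interpolated vectors coincide with the original ones, which is a quick sanity check on any implementation.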

Note that when $\mathbf{m}_4$ is inaccurate, the interpolated motion vectors for the other frames in the temporal window ($\tilde{\mathbf{m}}_i$) are also inaccurate. In that case, we would shift the kernels in the wrong direction, and the MASK weights would be less effective for temporal upscaling. Therefore, one should incorporate a test of the reliability of $\mathbf{m}_4$ into the process and use the vectors $\mathbf{m}_i$ instead of $\tilde{\mathbf{m}}_i$ if $\mathbf{m}_4$ is found to be unreliable. Our specific technique for computing the reliability of motion vectors is described in Section 4.2.
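The fallback logic can be sketched as below. The reliability test itself is the subject of Section 4.2 and is not reproduced here, so its outcome is passed in as a plain boolean. The function name and the `m4_reliable` flag are hypothetical.

```python
def select_motion_vectors(m, m4, t, t3, m4_reliable):
    """Return the motion vectors used to shift the kernels for frame interpolation.

    If m_4 passed the reliability test (result supplied by the caller), use
    the interpolated vectors m~_i = m_i + m_4 * (t - t_3); otherwise fall
    back to the original vectors m_i.
    """
    if not m4_reliable:
        return list(m)
    s = t - t3
    return [(mi[0] + m4[0] * s, mi[1] + m4[1] * s) for mi in m]

m = [(-4, -2), (-2, -1), (0, 0), (2, 1), (4, 2)]
trusted = select_motion_vectors(m, m[3], t=3.5, t3=3, m4_reliable=True)
fallback = select_motion_vectors(m, m[3], t=3.5, t3=3, m4_reliable=False)
```

Keeping the reliability decision outside this function lets the test be swapped without touching the weight computation.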

4 A Practical Video Upscaling Algorithm Based on MASK

In this section, we describe a complete algorithm for spatial upscaling, denoising and enhancement, as well as temporal frame interpolation, based on the MASK approach. We introduce several techniques that enable a practical implementation of the MASK principles explained in the previous section. In particular, we develop an algorithm with reduced computational complexity and reduced memory requirements that is suitable for both software and hardware implementation.
