Displays supporting frame rates of 120[Hz] and 240[Hz] are becoming available.
Such displays may exceed the highest spatial resolution and frame rate of video
content commonly available, namely 1080×1920, 60[Hz] progressive High Definition
(HD) video, in consumer applications such as HD broadcast TV and Blu-ray Disc. In
such (and other) applications,
the goal of spatial and temporal video interpolation is to enhance
the resolution of the input video in a manner that is visually pleasing and artifact-
free. Common visual artifacts that may occur in spatial and temporal interpolation
are: edge jaggedness, ringing, blurring of edges and texture detail, as well as mo-
tion blur and judder. In addition, the input video usually contains noise and other
artifacts, e.g. caused by compression. Due to increasing sizes of modern video dis-
plays, as well as incorporation of new display technologies (e.g. higher brightness,
wider color gamut), artifacts in the input video and those introduced by scaling are
amplified, and become more visible than with past display technologies. High qual-
ity video upscaling requires resolution enhancement and sharpness enhancement as
well as noise and compression artifact reduction.
A common approach for spatial image and video upscaling is to use linear filters
with compact support, such as those from the family of cubic filters [1]. In this chap-
ter, our focus is on multi-frame methods, which enable resolution enhancement
in spatial upscaling, and allow temporal frame interpolation (frame rate upconver-
sion). Although many algorithms have been proposed for image and video interpo-
lation, spatial upscaling and frame interpolation (temporal upscaling) are generally
treated separately. The conventional super-resolution technique for spatial upscaling
consists of image reconstruction from irregularly sampled pixels, provided by reg-
istering multiple low resolution frames onto a high resolution grid using motion es-
timation, see [2, 3] for overviews. A recent work by Narayanan et al. [4] proposed
a video-to-video super-resolution algorithm using a partition filtering technique, in
which local image structures are classified into vertical, horizontal, and diagonal
edges, textures, and flat areas by vector quantization [5] (involving off-line learn-
ing), and a suitable filter is prepared for each structure class beforehand. Then, with
the partition filter, they interpolate the missing pixels and recover a high resolution
video frame. Another recent approach in [6] uses an adaptive Wiener filter and has
a low computational complexity when using a global translational motion model.
Such a restrictive motion model is typical of many conventional super-resolution
methods, which as a result often cannot handle more complex motion.
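To make the linear-filtering baseline concrete, the following is a minimal sketch of
one-dimensional cubic upscaling using Keys' cubic convolution kernel (the common
choice a = -0.5 gives the Catmull-Rom spline). The function names, the border
replication policy, and the restriction to integer factors are illustrative
assumptions, not code from any of the cited works.

import numpy as np

def cubic_kernel(x, a=-0.5):
    # Keys' cubic convolution kernel; a = -0.5 yields the Catmull-Rom spline.
    x = np.abs(x)
    out = np.zeros_like(x, dtype=float)
    near = x <= 1
    far = (x > 1) & (x < 2)
    out[near] = (a + 2) * x[near]**3 - (a + 3) * x[near]**2 + 1
    out[far] = a * x[far]**3 - 5 * a * x[far]**2 + 8 * a * x[far] - 4 * a
    return out

def upscale_1d(signal, factor):
    # Resample a 1-D signal by an integer factor with a 4-tap cubic filter.
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    coords = np.arange(n * factor) / factor   # output positions on the input grid
    base = np.floor(coords).astype(int)
    out = np.zeros(len(coords))
    for tap in (-1, 0, 1, 2):                 # compact 4-tap support
        idx = np.clip(base + tap, 0, n - 1)   # replicate samples at the borders
        out += signal[idx] * cubic_kernel(coords - (base + tap))
    return out

Since the filter is separable, applying upscale_1d along rows and then along
columns gives the usual bicubic upscaling of an image.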
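The reconstruction step itself can be sketched as a shift-and-add fusion under a
global translational motion model with known sub-pixel shifts, in the spirit of the
low-complexity setting of [6]. The function name, the shift convention, and the
nearest-cell rounding are assumptions for illustration; a complete method would
also fill high-resolution cells that receive no samples and deblur the result.

import numpy as np

def shift_and_add(lr_frames, shifts, factor):
    # Fuse registered low-resolution frames onto a high-resolution grid.
    # `shifts` holds one (dy, dx) sub-pixel translation per frame, in LR
    # pixel units, assumed known from a prior motion-estimation step.
    h, w = lr_frames[0].shape
    acc = np.zeros((h * factor, w * factor))   # accumulated intensities
    cnt = np.zeros_like(acc)                   # number of samples per HR cell
    ys, xs = np.mgrid[0:h, 0:w]
    for frame, (dy, dx) in zip(lr_frames, shifts):
        # Round each LR sample to its nearest cell on the HR grid.
        hy = np.clip(np.round((ys + dy) * factor).astype(int), 0, h * factor - 1)
        hx = np.clip(np.round((xs + dx) * factor).astype(int), 0, w * factor - 1)
        np.add.at(acc, (hy, hx), frame)
        np.add.at(cnt, (hy, hx), 1)
    # Average where samples landed; empty cells stay zero and would need
    # spatial interpolation in a full method.
    return np.where(cnt > 0, acc / np.maximum(cnt, 1), 0.0)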
For temporal upscaling, a technique called motion compensated frame interpo-
lation is popular. In [7], Fujiwara et al. extract motion vectors from a compressed
video stream for motion compensation. However, these motion vectors are often
unreliable; hence, they refine the motion vectors using a block-matching approach
with variable-size blocks. Similarly to Fujiwara's work, in [8], Huang et al. proposed
another refinement approach for motion vectors. Using the motion reliability com-
puted from prediction errors of neighboring frames, they smooth the motion vector
field by employing a vector median filter whose weights are determined by the local
motion reliability. In [9, 10], instead of refining the motion vector field, Kang et al.
and Choi et al. proposed block-matching motion estimation with an overlapped,
variable-size block technique in order to estimate motion as accurately as possible.
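As a minimal sketch of this pipeline, the following pairs exhaustive block-matching
motion estimation between two consecutive frames with synthesis of the temporally
centered frame from motion-compensated blocks. The block size, the search range,
the midpoint compensation scheme, and the assumption that the frame dimensions
are multiples of the block size are illustrative simplifications; the refinements
discussed above (variable-size blocks [7], reliability-weighted vector median
filtering [8], overlapped and variable-size blocks [9, 10]) are omitted.

import numpy as np

def block_match(prev, curr, block=16, search=8):
    # One motion vector per block of `curr`: exhaustive SAD search in `prev`.
    prev = prev.astype(np.int32)
    curr = curr.astype(np.int32)
    h, w = curr.shape
    mvs = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            ref = curr[y:y + block, x:x + block]
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    sad = np.abs(prev[yy:yy + block, xx:xx + block] - ref).sum()
                    if best is None or sad < best:
                        best = sad
                        mvs[by, bx] = dy, dx
    return mvs

def interpolate_midframe(prev, curr, mvs, block=16):
    # Synthesize the frame halfway in time between `prev` and `curr` by
    # averaging blocks fetched half a motion vector away in each neighbor;
    # the occlusion and hole/overlap handling of real MCFI methods is omitted.
    prev = prev.astype(float)
    curr = curr.astype(float)
    h, w = curr.shape
    mid = np.zeros((h, w))
    for by in range(mvs.shape[0]):
        for bx in range(mvs.shape[1]):
            y, x = by * block, bx * block
            dy, dx = mvs[by, bx]
            py = min(max(y + dy // 2, 0), h - block)   # sample toward `prev`
            px = min(max(x + dx // 2, 0), w - block)
            cy = min(max(y - dy // 2, 0), h - block)   # sample toward `curr`
            cx = min(max(x - dx // 2, 0), w - block)
            mid[y:y + block, x:x + block] = 0.5 * (
                prev[py:py + block, px:px + block]
                + curr[cy:cy + block, cx:cx + block])
    return mid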