However, the difficulty of the motion-based approach is that, even if the motion
vector field is refined and/or smoothed, more complex effects (e.g., occlusions,
transparency, and reflection) are not accurately handled. That is, motion errors
are inevitable even after smoothing/refining the motion vector fields, and hence a
mechanism that accounts for these errors is necessary for producing artifact-free
outputs.
Unlike video processing algorithms that depend directly on motion vectors, the
recent work of Protter et al. [11] proposed a video-to-video super-resolution method
without explicit motion estimation or compensation based on the idea of Non-Local
Means [12]. Although the method produces impressive spatial upscaling results even
without motion estimation, the computational load is very high due to the exhaustive
search (across space and time) for blocks similar to the block of interest. In a related
work [13], we presented a space-time video upscaling method, called 3-D iterative
steering kernel regression (3-D ISKR), in which explicit subpixel motion estima-
tion is again avoided. 3-D ISKR is an extension of 2-D steering kernel regression
(SKR) proposed in [14, 15]. SKR is closely related to bilateral filtering [16, 17] and
normalized convolution [18]. These methods can achieve accurate and robust image
reconstruction results, due to their use of robust error norms and locally adaptive
weighting functions. 2-D SKR has been applied to spatial interpolation, denoising
and deblurring [15, 18, 19]. In 3-D ISKR, instead of relying on motion vectors, the
3-D kernel captures local spatial and temporal orientations based on local covari-
ance matrices of gradients of video data. With the adaptive kernel, the method is
capable of upscaling video with complex motion both in space and time.
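The local orientation analysis underlying 3-D ISKR can be illustrated with a short sketch. The following is not the authors' implementation but a minimal example, assuming a grayscale video stored as a NumPy array of shape (T, H, W); it forms the local covariance matrix (structure tensor) of the spatiotemporal gradients in a small window, whose eigenvectors indicate the dominant local space-time orientation along which a steering kernel could be elongated:

```python
import numpy as np

def local_gradient_covariance(video, t, y, x, radius=2):
    """Illustrative sketch: 3x3 covariance matrix of spatiotemporal
    gradients in a (2*radius+1)^3 window around (t, y, x).  Assumes
    the point is at least `radius` samples away from all borders."""
    # Central-difference gradients along t, y, and x.
    gt, gy, gx = np.gradient(video.astype(float))
    # Collect gradient samples from the local window.
    win = (slice(t - radius, t + radius + 1),
           slice(y - radius, y + radius + 1),
           slice(x - radius, x + radius + 1))
    G = np.stack([gt[win].ravel(), gy[win].ravel(), gx[win].ravel()], axis=1)
    # 3x3 local covariance of the gradient samples.
    return G.T @ G / G.shape[0]

# Example: a static sequence whose only structure varies along x,
# so the dominant eigenvector of C aligns with the x axis.
video = np.tile(np.sin(np.linspace(0, 6 * np.pi, 32))[None, None, :],
                (9, 32, 1))
C = local_gradient_covariance(video, t=4, y=16, x=16)
evals, evecs = np.linalg.eigh(C)  # principal axes of local orientation
```

In the actual method, such covariance matrices are computed at every sample and used to shape (steer) the regression kernels, rather than inspected via an explicit eigendecomposition as in this toy example.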
In this chapter, we build upon the 2-D steering kernel regression framework
proposed in [14], and develop a spatiotemporal (3-D) framework for processing
video. Specifically, we propose an approach we call motion-assisted steering kernel
(MASK) regression. The MASK function is a 3-D kernel; however, unlike 3-D
ISKR, it takes spatial (2-D) orientation and the local motion trajectory into
account separately, using an analysis of the local orientation and the local motion
vector to steer the spatiotemporal regression kernels. Subsequently, local
kernel regression is applied to compute weighted least-squares optimal pixel esti-
mates. Although 2-D kernel regression has been applied to achieve super-resolution
reconstruction through fusion of multiple pre-registered frames on to a 2-D plane
[14, 18], the proposed method is different in that it does not require explicit mo-
tion compensation of the video frames. Instead, we use 3-D weighting kernels that
are “warped” according to estimated motion vectors, such that the regression pro-
cess acts directly upon the video data. Although we consider local motion vectors
in MASK, we propose an algorithm that is robust against errors in the estimated
motion field. Prior multi-frame resolution enhancement or super-resolution (SR)
reconstruction methods [2, 3] typically consider only global translational or affine
motion, leaving local motion and object occlusions unaddressed. Many SR methods
also require explicit motion compensation, which may involve interpolation or
rounding of displacements to grid locations. These issues can have a negative
impact on accuracy
and robustness. Our proposed method is capable of handling local motions, avoids
explicit motion compensation, and is more robust. The proposed MASK approach is
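The idea of weighting kernels "warped" along estimated motion vectors, followed by local kernel regression, can be sketched as follows. This is a hedged illustration, not the chapter's implementation: the smoothing parameters `h_s` and `h_t`, the zeroth-order (weighted-average) regression, and the plain Gaussian shape are all simplifying assumptions made for the example.

```python
import numpy as np

def mask_weight(dt, dy, dx, motion, h_s=1.0, h_t=1.0):
    """Hedged sketch of a motion-assisted weight: a spatial Gaussian
    whose center at temporal offset dt is shifted along the estimated
    motion vector (vy, vx), times a temporal Gaussian penalty."""
    vy, vx = motion
    # "Warp": measure spatial distance relative to the motion trajectory.
    ry, rx = dy - vy * dt, dx - vx * dt
    return (np.exp(-(ry**2 + rx**2) / (2 * h_s**2))
            * np.exp(-dt**2 / (2 * h_t**2)))

def mask_estimate(video, t, y, x, motion, radius=2):
    """Zeroth-order kernel regression: the weighted least-squares
    estimate with constant local model, i.e. a weighted average of
    the spatiotemporal samples under the warped kernel."""
    num = den = 0.0
    T, H, W = video.shape
    for dt in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                tt, yy, xx = t + dt, y + dy, x + dx
                if 0 <= tt < T and 0 <= yy < H and 0 <= xx < W:
                    w = mask_weight(dt, dy, dx, motion)
                    num += w * video[tt, yy, xx]
                    den += w
    return num / den
```

Because the weights follow the motion trajectory, samples from neighboring frames that belong to the same moving structure receive large weights without ever being explicitly motion-compensated onto a common grid; higher-order local models and robustness to motion errors are what distinguish the full method from this sketch.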