Digital Signal Processing Reference
Fig. 3 Representations of a 3D scene: (a) epipolar image, (b) side-by-side stereoscopic pair, (c) 2D+Z image pair, and (d) mesh
The first group of representations, multiview images, combines all observations in a single bitmap. For a stereoscopic image, both views can, for example, be placed side-by-side in a single frame. An alternative approach is to encode the differences between the observations, similarly to the way temporal similarities are encoded in a video file, as done in the MPEG-4 MVC description.
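The idea of inter-view difference coding can be sketched in a few lines. This is not the MVC bitstream format itself, only an illustration of why it works: the array sizes and the shift-based "disparity-compensated" prediction are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 8-bit left view; the right view is the left view shifted by a
# 1-pixel horizontal disparity (a crude stand-in for a second camera).
left = rng.integers(0, 200, size=(4, 8)).astype(np.int16)
right = np.roll(left, 1, axis=1)

# Inter-view coding idea: store the left view in full, predict the
# right view from it, and encode only the residual.
prediction = np.roll(left, 1, axis=1)   # disparity-compensated prediction
residual = right - prediction

# A decoder reconstructs the right view exactly from base + residual.
reconstructed = prediction + residual
assert np.array_equal(reconstructed, right)

# Here the prediction is perfect, so the residual is all zeros and
# costs almost nothing to store compared with a second full image.
print(int(np.abs(residual).sum()))  # 0
```

In a real encoder the prediction is imperfect, so the residual is small but nonzero; it still compresses far better than an independent second view.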
The second group of scene representations is video-plus-depth, where each pixel is augmented with information about its distance from the camera. A straightforward
way to represent video-plus-depth is to encode the depth map as a grey scale picture
and place the 2D image and its depth map side-by-side. The intensity of each depth
map pixel represents the depth of the corresponding pixel from the 2D image. Such
a format is sometimes referred to as 2D+Z, and an example of this representation is shown in Fig. 3c. Such a representation allows the rendering of virtual views based on the geometrical information about the scene encoded in the depth map. Thus, it is suitable for multiview displays and can be used regardless of the display type. Video-plus-depth can be efficiently compressed, and recently MPEG specified a container format for this representation. However, rendering scene observations from a 2D+Z description requires disocclusion filling,
which can introduce artifacts. This is being addressed by using layered depth images (LDI). A depth map is usually not captured directly, but it can be derived from multiview images (using depth estimation algorithms) or from point cloud data captured by range sensors. In the case of a synthetic 3D scene, obtaining a dense depth map is straightforward, since solving the occlusions during rendering already requires calculating the distance between the camera and every visible point of the scene.
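The side-by-side 2D+Z packing described above can be sketched as follows. The function name, the quantization range, and the "nearer pixels are brighter" convention are assumptions for illustration; conventions vary between systems.

```python
import numpy as np

def pack_2d_plus_z(image, depth, z_near, z_far):
    """Pack an RGB frame and its depth map side-by-side (2D+Z).

    The depth map is quantized to an 8-bit grey-scale image; here
    nearer pixels are brighter, one common (but not universal) choice.
    """
    # Normalize metric depth to [0, 1], then invert so near = bright.
    norm = (depth - z_near) / (z_far - z_near)
    grey = np.round((1.0 - np.clip(norm, 0.0, 1.0)) * 255).astype(np.uint8)
    # Replicate the grey value across three channels so both halves
    # share one RGB pixel format, then place them side by side.
    depth_rgb = np.repeat(grey[:, :, None], 3, axis=2)
    return np.concatenate([image, depth_rgb], axis=1)

# Toy 2x2 frame with per-pixel depths between 1 m and 5 m.
img = np.zeros((2, 2, 3), dtype=np.uint8)
z = np.array([[1.0, 5.0], [3.0, 1.0]])
packed = pack_2d_plus_z(img, z, z_near=1.0, z_far=5.0)
print(packed.shape)  # (2, 4, 3)
```

The quantization step is also where the format's limitation shows: 256 grey levels can only approximate the scene's true depth range.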
The third group of representations stores scene geometry in a vectorized form, for example as a mesh (Fig. 3d). This is a natural fit for synthetic content, since synthetic 3D scenes are already described in terms of shapes and surfaces.
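A minimal sketch of such a vectorized description is a triangle mesh stored as vertex and face arrays; the unit-square example below is invented for illustration.

```python
import numpy as np

# A vectorized scene description stores geometry as shapes rather than
# pixels: here, a unit square as 4 vertices and 2 triangular faces.
vertices = np.array([
    [0.0, 0.0, 0.0],   # v0
    [1.0, 0.0, 0.0],   # v1
    [1.0, 1.0, 0.0],   # v2
    [0.0, 1.0, 0.0],   # v3
])
faces = np.array([
    [0, 1, 2],          # triangle v0-v1-v2
    [0, 2, 3],          # triangle v0-v2-v3
])

def tri_area(tri):
    # Area of one triangle via the cross-product of two edge vectors.
    a, b, c = vertices[tri]
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a))

# Sanity check: the two triangles together cover the unit square.
print(sum(tri_area(f) for f in faces))  # 1.0
```

Unlike the image-based formats above, this representation is resolution-independent: any number of views can be rendered from it at any size, which is why it suits synthetic content so well.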