Fig. 6.6 Visual comparison of the state-of-the-art approach (b) with the proposed approach (c). (a)
Original. (b) GBVS [ 8 ]. (c) Proposed [ 30 ]
observed that a human face can be detected incorrectly because of a change in lighting
conditions or the presence of multiple faces. This is shown in Fig. 6.6, which
compares a saliency map obtained from the proposed approach with
the GBVS saliency maps [ 8 ]. However, the proposed method detects and highlights
more salient regions than the other approach.
6.2.5 Detection Using Depth Information
In the previous sections, saliency was computed in 2D
space. In the literature as well, the main contributions to saliency extraction
focus on 2D image processing. In the context of this chapter, we use video
sequences acquired with a multi-view camera rig [ 34 ] which captures one scene
from different points of view. The content is then rendered on auto-stereoscopic
screens, on which the user can virtually move around the scene. To do so,
the multi-view renderer needs disparity maps between the different cameras. These
disparity maps, computed just after the acquisition stage, are transmitted jointly with
the camera views and allow the renderer to extrapolate nonexistent intermediate
views. In such a context, the disparity information is therefore
available with the corresponding views and provides useful additional information
for computing saliency maps. In particular, during the video encoding process,
encoders could exploit this information to improve compression
performance.
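As a minimal sketch of how an encoder might exploit a disparity map in this way, the following Python code quantizes disparity into a few saliency levels and assigns each level a quantization parameter. It is an illustration under stated assumptions, not the method used in this chapter: the function names, the number of levels, and the two quantization parameter bounds are hypothetical, and nearer regions (larger disparity) are simply assumed to be more salient.

```python
from typing import List


def disparity_to_levels(disparity: List[List[float]],
                        n_levels: int = 4) -> List[List[int]]:
    """Quantize a disparity map into saliency levels (0 = least salient).

    Assumption: larger disparity means nearer to the camera, and nearer
    regions are treated as more salient.
    """
    flat = [d for row in disparity for d in row]
    d_min, d_max = min(flat), max(flat)
    span = d_max - d_min
    if span == 0:
        # Flat disparity map: every pixel falls in the lowest level.
        return [[0 for _ in row] for row in disparity]
    return [
        # Normalize to [0, 1], scale to n_levels bins, clamp the maximum.
        [min(int((d - d_min) / span * n_levels), n_levels - 1) for d in row]
        for row in disparity
    ]


def levels_to_qp(levels: List[List[int]],
                 n_levels: int = 4,
                 qp_salient: int = 22,
                 qp_background: int = 38) -> List[List[int]]:
    """Map saliency levels to quantization parameters (hypothetical values).

    More salient levels receive a lower parameter, that is, finer
    quantization and therefore more bit rate.
    """
    if n_levels < 2:
        return [[qp_background for _ in row] for row in levels]
    step = (qp_background - qp_salient) / (n_levels - 1)
    return [[round(qp_background - lvl * step) for lvl in row]
            for row in levels]


if __name__ == "__main__":
    toy_disparity = [[0.0, 10.0], [20.0, 40.0]]
    toy_levels = disparity_to_levels(toy_disparity)
    print(toy_levels)
    print(levels_to_qp(toy_levels))
```

A per-pixel map is used here only for brevity. A real encoder would more likely aggregate the levels per block or per region before choosing quantization steps.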
In the literature, a few studies deal with the contribution of stereo disparity or
depth to saliency map extraction. In [ 28 ], a simple approach that combines
nearness and motion is proposed. In [ 32 ], the proposed approach is based on 3D
motion estimation, while the more advanced and complete approaches described in
[ 12 , 39 ] combine image saliency, motion saliency, and depth saliency.
In [ 33 ], a bit rate adaptation algorithm is described in which the video encoder
allocates more or less bit rate according to the interest or saliency level of each region.
The content used in that study comes from a video game for which a perfect depth
map is available. Based on this depth information, a simple and fast algorithm
divides the image into regions with different saliency levels, and these levels are
then mapped to quantization steps. For our approach, we transpose this work to
the multi-view context by using disparity maps in order to differentiate the image