To segment attention objects successfully, it is important to measure saliency from images and video accurately. The best-known visual attention model is the Itti model, which was developed for rapid scene analysis by combining multiscale image features into a single topographical saliency map [47]. A dynamical neural network was then used to select attended locations from the saliency map. This work presented a conceptually simple computational model of saliency-driven focal visual attention, which also incorporated some of the basic mechanisms underlying the performance of primate visual systems, such as center-surround operations and multiscale saliency. This model was successfully applied to object extraction from color images [48], where attention objects were formulated as a Markov random field by integrating computational visual attention mechanisms with attention-object growing techniques. To extract visual attention effectively, many methods for salient point detection have been proposed recently, such as frequency-tuned saliency (FT) [49], spectral residual saliency [50], site entropy rate [51], and context-aware saliency [52].
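As an illustration of this family of methods, the frequency-tuned approach computes saliency as the per-pixel distance between the global image mean and a Gaussian-smoothed version of the image. The sketch below applies this idea directly to whatever channels are supplied (the original paper operates in the Lab color space), with an illustrative smoothing width:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def frequency_tuned_saliency(img):
    """Frequency-tuned saliency sketch: per-pixel distance between the
    global channel means and a Gaussian-smoothed image.  `img` is an
    H x W x 3 float array (Lab in the original method; applied here to
    the given channels as-is, as an illustrative simplification)."""
    mean_vec = img.reshape(-1, img.shape[2]).mean(axis=0)   # global mean per channel
    smoothed = np.stack([gaussian_filter(img[..., c], sigma=2.0)
                         for c in range(img.shape[2])], axis=-1)
    return np.linalg.norm(smoothed - mean_vec, axis=2)      # saliency map

# Usage on a synthetic image: a bright disc on a dark background
h, w = 64, 64
yy, xx = np.mgrid[:h, :w]
img = np.zeros((h, w, 3))
img[(yy - 32) ** 2 + (xx - 32) ** 2 < 100] = 1.0
sal = frequency_tuned_saliency(img)
```

On such an image the disc, which deviates most from the global mean, receives the highest saliency values.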
In addition, building on the visual attention idea, several object attention models have been constructed to extract objects of interest from videos, such as the facial saliency model [53] and the focused saliency model [54]. Unlike general saliency models, an object attention model is designed using prior knowledge or a training procedure. For example, the first model, given in [53], segments human faces from head-and-shoulder video based on a facial saliency map, which is defined as:
S(x, y) = P_1(x, y) · P_2(x, y) · P_3(x, y),    (1.8)
where P_1, P_2, and P_3 denote the "conspicuity maps" corresponding to the chrominance, position, and luminance components, respectively. Each component exploits knowledge of the human face; for instance, skin color can be detected by the presence of a certain range of chrominance values with a narrow, consistent distribution in the YCbCr color space. An example of a facial saliency map is shown in Fig. 1.9, where high saliency values usually correspond to face regions.
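The product in Eq. (1.8) can be sketched as follows. The skin-chrominance thresholds, the center-weighted position prior, and the use of normalized luma for P_3 are all illustrative assumptions, not the exact conspicuity maps of [53]:

```python
import numpy as np

def facial_saliency(rgb):
    """Sketch of Eq. (1.8): S(x, y) = P1(x, y) * P2(x, y) * P3(x, y).
    P1: skin-chrominance conspicuity in YCbCr (the Cb/Cr ranges below are
        common illustrative skin-tone values, not those from [53]);
    P2: position conspicuity, assumed here to favour the image centre;
    P3: luminance conspicuity, taken here as normalised luma."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]     # 0..255 floats
    # BT.601 RGB -> YCbCr conversion
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    p1 = ((cb > 77) & (cb < 127) & (cr > 133) & (cr < 173)).astype(float)
    h, w = y.shape
    yy, xx = np.mgrid[:h, :w]
    p2 = np.exp(-((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
                / (2 * (0.25 * min(h, w)) ** 2))        # centre prior
    p3 = y / 255.0
    return p1 * p2 * p3                                  # Eq. (1.8)

# Usage: a skin-coloured patch centred in a blue frame
img = np.zeros((40, 40, 3))
img[..., 2] = 200.0                         # blue background
img[15:25, 15:25] = [200.0, 150.0, 120.0]   # skin-like patch
s = facial_saliency(img)
```

Because the maps are multiplied, a region must satisfy all three conspicuity cues at once to remain salient, which is what suppresses skin-colored but off-center or dark distractors.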
To highlight the primary objects in an image, the attention object is usually shown in sharp focus, whereas background objects are typically blurred because they are out of focus. The second saliency model [54] was proposed to extract focused objects automatically based on a matting model, and it mainly consists of three steps. The first step is to generate a re-blurred version of the input video frame with a point-spread function. The focused saliency map of
Fig. 1.9 An example of a facial saliency map. Left: original image claire. Right: the facial saliency map
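The re-blurring step described above can be sketched as follows; the Gaussian point-spread function and its width are illustrative assumptions, not the exact choices of [54]. The intuition is that sharp (focused) regions change a lot when re-blurred, while already-defocused background changes little:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def focus_saliency(gray, sigma=2.0):
    """Sketch of the re-blurring step of a focus-based saliency model:
    re-blur the frame with an assumed Gaussian point-spread function and
    measure the local change.  In-focus regions (rich in high frequencies)
    change strongly; defocused background barely changes."""
    reblurred = gaussian_filter(gray, sigma)    # assumed Gaussian PSF
    diff = np.abs(gray - reblurred)             # per-pixel change under re-blur
    return gaussian_filter(diff, sigma)         # smoothed focus-saliency map

# Usage: a sharp patch on a heavily defocused noisy background
rng = np.random.default_rng(0)
frame = gaussian_filter(rng.random((64, 64)), 4.0)  # defocused background
frame[24:40, 24:40] = rng.random((16, 16))          # sharp in-focus patch
fs = focus_saliency(frame)
```

The resulting map is high over the sharp patch and low over the defocused background, giving a rough mask from which a matting model could then extract the focused object.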