Fig. 4.7 Some representative results. Note that here the eye density maps are not convolved with a Gaussian kernel, which is a popular method to recover more positive samples for the evaluation. (a) Original frames; (b) eye fixation maps; (c) [24]; (d) [19]; (e) [20]; (f) [16]; (g) [13]; (h) [15]; (i) [68]; (j) [46]; (k) [29]; (l) [44]; (m) our approach
Fig. 4.8 Targets and distractors in different scenes can be best distinguished by different features. (a), (b) the “motion” feature; (c), (d) the “color” feature
4.3.3.4 Multi-Task Rank Learning for Visual Saliency Estimation in Video
Generally speaking, a unified ranking function derived with the proposed approach can obtain impressive results in some cases but may suffer poor performance in others, since it constructs a single model for all scenes. In fact, the features that best distinguish targets from distractors may vary remarkably across scenes. In surveillance video, for instance, motion features can efficiently pop out a car or a walking person (as shown in Fig. 4.8a, b), whereas distinguishing a red apple or flower from its surroundings requires color contrasts (as shown in Fig. 4.8c, d). In most cases it is infeasible to pop out the targets and suppress the distractors with a fixed set of visual features. Therefore, it is necessary to construct scene-specific models that adaptively adopt different solutions for different scene categories.
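To make the pair-wise ranking formulation concrete, the following is a minimal sketch, not the authors' implementation: it trains a linear ranking function with a hinge loss over (target, distractor) region pairs by sub-gradient descent. The function names, feature layout, and hyper-parameters are illustrative assumptions.

```python
import numpy as np

def pairwise_rank_loss(w, features, target_idx, distractor_idx, margin=1.0):
    """Hinge loss over (target, distractor) pairs: each target region's
    saliency score w^T x should exceed each distractor's score by `margin`."""
    scores = features @ w                         # one linear saliency score per region
    losses = [max(0.0, margin - (scores[t] - scores[d]))
              for t in target_idx for d in distractor_idx]
    return float(np.mean(losses))

def train_ranker(features, target_idx, distractor_idx,
                 lr=0.01, epochs=200, reg=1e-3, margin=1.0):
    """Sub-gradient descent on the pair-wise hinge loss with L2 regularization."""
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        scores = features @ w
        grad = np.zeros_like(w)
        n_pairs = 0
        for t in target_idx:
            for d in distractor_idx:
                n_pairs += 1
                if scores[t] - scores[d] < margin:        # margin violated by this pair
                    grad -= features[t] - features[d]
        w -= lr * (grad / max(n_pairs, 1) + reg * w)
    return w

# Toy usage (hypothetical data): 3 contrast features (e.g. motion, color, intensity)
# for 6 regions; regions 0-1 attract fixations (targets), regions 2-5 do not.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
w = train_ranker(X, target_idx=[0, 1], distractor_idx=[2, 3, 4, 5])
saliency = X @ w          # higher score = estimated to be more salient
print(pairwise_rank_loss(w, X, [0, 1], [2, 3, 4, 5]))
```

Under this reading, the multi-task variant introduced next would maintain one such weight vector per scene cluster rather than a single global one.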
Toward this end, we propose a multi-task rank learning approach for visual saliency estimation. In this approach, visual saliency estimation is also formulated as a pair-wise rank learning problem. However, this approach constructs multiple visual saliency models, each for a scene cluster, by learning and integrating the features that best distinguish targets from distractors in that cluster. We also propose