Digital Signal Processing Reference
Independent Tracking Without Fusion
Independent tracking without fusion tracks object regions in multiview video by segmenting each frame in each view independently.
Cigla et al. [3] presented a multiview video object segmentation algorithm that integrates color, depth, and motion features. A region-based color segmentation algorithm based on modified Normalized Cuts is first applied to produce an over-segmentation. A depth map is then estimated for the subregions in the resulting segmentation mask under a region-wise planarity assumption. Multiview video segmentation extends this image segmentation by combining the color and depth cues with additional optical-flow information that provides the motion field.
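As a rough illustration of how such cues might be combined, the sketch below fuses color, depth, and motion differences into a single region affinity. The Gaussian weighting and the sigma values are illustrative assumptions, not the exact formulation of [3].

```python
import numpy as np

def combined_affinity(color, depth, flow,
                      sigma_c=10.0, sigma_d=0.5, sigma_m=1.0):
    """Affinity between two regions from color, depth, and motion cues.

    Each argument is a (feature_a, feature_b) pair for the two regions.
    The multiplicative Gaussian weighting and the sigma values are
    illustrative choices, not the exact formulation of Cigla et al.
    """
    ca, cb = color   # mean color vectors of the two regions
    da, db = depth   # scalar plane depths of the two regions
    fa, fb = flow    # mean optical-flow vectors of the two regions
    w_color = np.exp(-np.sum((np.asarray(ca) - np.asarray(cb)) ** 2) / sigma_c ** 2)
    w_depth = np.exp(-(da - db) ** 2 / sigma_d ** 2)
    w_motion = np.exp(-np.sum((np.asarray(fa) - np.asarray(fb)) ** 2) / sigma_m ** 2)
    # High affinity only when all three cues agree, so regions that
    # match in color but differ in depth or motion stay separated.
    return w_color * w_depth * w_motion
```

Because the cues multiply, two regions merge only if they are consistent in color, lie on similar depth planes, and move alike, which is the intuition behind combining the three features.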
Independent Tracking with Fusion
Independent tracking with fusion segments and tracks objects in each camera stream separately and then either projects the tracks to another camera view or to a common view (the ground plane [27, 49] or a "plan-view" map [40]), or collects the 2D local tracks from the individual views into a global 3D track [29] or at a central node [43].
A multiview segmentation and tracking system for cluttered scenes with multiple people, named M2Tracker, is presented in [27]. Priors on each object's approximate shape and location aid the segmentation of each view using Bayesian classification. A region-based stereo algorithm then finds the 3D points inside each object. By combining evidence from different camera pairs and producing a feet-region likelihood estimate on the ground plane, globally optimal detection and tracking of objects is attained with a Kalman filter.
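The ground-plane fusion step can be sketched as follows: an image-plane feet position is mapped onto the ground plane with a camera-to-ground homography, and a constant-velocity Kalman filter smooths the resulting track. The homography, the state model, and the noise levels below are illustrative assumptions, not the exact M2Tracker formulation.

```python
import numpy as np

def to_ground_plane(H, feet_xy):
    """Map an image-plane feet position to ground-plane coordinates via a
    3x3 camera-to-ground homography H (assumed known from calibration)."""
    p = H @ np.array([feet_xy[0], feet_xy[1], 1.0])
    return p[:2] / p[2]

class GroundPlaneKalman:
    """Constant-velocity Kalman filter on the ground plane.
    State: [x, y, vx, vy]; the noise covariances are illustrative."""
    def __init__(self, x0, y0, dt=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])
        self.P = np.eye(4)
        self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4)); self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = 0.01 * np.eye(4)   # process noise
        self.R = 0.1 * np.eye(2)    # measurement noise

    def step(self, z):
        # Predict with the constant-velocity model.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update with the fused ground-plane measurement z = [x, y].
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```

In a multi-camera setting, each camera pair would contribute a measurement through its own homography, and all measurements for one object would be fed to the same filter.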
Zhao et al. [49] presented a similar, real-time system that detects and tracks objects independently for each stereo camera and integrates the tracking results from all camera pairs in a multi-camera tracker (McTracker), which tracks each object on the ground plane. An object tracking framework based on a dynamic Bayesian formulation is reported in [40]; it observes and tracks objects on a plan-view map by combining local appearance features with stereo depth data.
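A plan-view map of the kind used in [40] can be approximated by accumulating the 3D points recovered from stereo depth into a top-down occupancy histogram. The cell size, extent, and axis convention below are illustrative assumptions; the original work additionally folds appearance features into the map.

```python
import numpy as np

def plan_view_map(points_xyz, cell=0.1, extent=10.0):
    """Accumulate 3D points (x right, y up, z depth, in meters) into a
    top-down plan-view occupancy grid covering [-extent, extent] on the
    x and z axes. Cell size and extent are illustrative choices."""
    n = int(2 * extent / cell)
    grid = np.zeros((n, n))
    for x, y, z in points_xyz:
        i = int((x + extent) / cell)   # lateral position -> column band
        j = int((z + extent) / cell)   # depth -> row band
        if 0 <= i < n and 0 <= j < n:
            grid[i, j] += 1            # height (y) is collapsed away
    return grid
```

Peaks in such a grid correspond to people standing in the scene, so tracking on the plan-view map reduces to following 2D blobs rather than reasoning in full 3D.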
Instead of projecting the multiple single-view tracks to a common view, combining the tracked 2D objects in a 3D tracking module is another strategy for multiview data fusion. In [29], after people are detected using background subtraction and human-template correlation, 2D objects are tracked separately in each camera by graph matching. A 3D tracker then exploits the geometric consistency between the 2D objects to estimate each 3D head position. For tracking large numbers of closely spaced, fast-moving objects, i.e., hundreds of flying bats, a multiobject multi-camera tracking framework is proposed in [43]. It maintains sensor-level tracking in each view and sends the single-view measurements to a central node for across-view data association and tracker fusion. Feedback from the central node is then used to adjust the sensor-level tracks via across-frame data association.
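The across-view association performed at such a central node can be sketched as gated nearest-neighbour matching of local track positions that have already been mapped into a common frame. The greedy strategy and the gating threshold are illustrative stand-ins; [43] uses a more elaborate association and fusion scheme.

```python
import numpy as np

def associate_tracks(tracks_a, tracks_b, gate=1.0):
    """Greedily pair local track positions from two views, assumed
    already expressed in a common coordinate frame. Returns a list of
    (index_in_a, index_in_b) pairs whose distance is within the gate.
    The greedy rule and gate value are illustrative assumptions."""
    pairs, used_b = [], set()
    for i, pa in enumerate(tracks_a):
        candidates = [(np.linalg.norm(np.asarray(pa) - np.asarray(pb)), j)
                      for j, pb in enumerate(tracks_b) if j not in used_b]
        if not candidates:
            continue
        d, j = min(candidates)
        if d <= gate:            # gating: reject implausible matches
            pairs.append((i, j))
            used_b.add(j)        # one-to-one assignment
    return pairs
```

In practice an optimal assignment (e.g., the Hungarian algorithm) is preferred over this greedy pass when many objects are tightly spaced, since a single greedy mismatch can cascade into further errors.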