are typically divided into two broad categories based on the field of view (FOV)
of each camera: common FOV methods [1][2], where the cameras' FOVs largely
overlap, and disjoint FOV methods [4][5][6], where tracking of an object is
handed off from one camera's FOV to another's. Traditional tracking methods
such as Kalman filters are not appropriate when the topology of the camera
network is unknown and the cameras are uncalibrated [4].
One of the classic problems in multi-camera tracking over either overlapping
or disjoint FOVs is the entry/exit problem, i.e., given that an object has left a
FOV at a particular location, which camera is most likely to see the object next,
where within that camera's FOV, and when? One solution to this problem was
presented by Javed et al. in [7]. Visual characteristics of objects were first used
to determine corresponding objects in different FOVs. Entry and exit points
in each camera's FOV were then determined using kernel density estimation.
Finally, optimal correspondences between entry and exit points were determined
using a maximum a posteriori (MAP) approach based on a bipartite graph. Javed's
method works well with independent FOV scenarios without any inter-camera
calibration. However, it is restricted by the following:
1. a training phase must be available where correspondences between tracks
are known;
2. the entire set of observations must be available; hence, the method cannot
be deployed in real-time applications; and
3. the changes in visual characteristics of objects between camera views are
assumed to happen in the same, generally predictable way.
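To make the second step of Javed's approach concrete, the following is a minimal sketch of kernel density estimation over observed exit locations. It is not the implementation from [7]: the one-dimensional coordinates, the fixed Gaussian bandwidth, and the sample values are illustrative assumptions.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Build a Gaussian kernel density estimator from observed exit locations.

    Each sample contributes one Gaussian kernel; the density at x is the
    normalised sum of all kernels evaluated at x.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical x-coordinates (pixels) where tracked objects left the FOV:
# two clusters, suggesting two exit points along the image border.
exits = [310, 318, 305, 640, 652, 648, 660]
pdf = gaussian_kde(exits, bandwidth=15.0)

# The estimated density peaks near the two exit clusters, not between them.
print(pdf(315) > pdf(480))  # True
print(pdf(650) > pdf(480))  # True
```

In [7] the resulting density modes serve as the discrete entry/exit points between which the MAP correspondence is computed.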
In this paper, we present a unified framework to solve the multi-camera tracking
problem in both independent FOV and common FOV cases. We assume that
objects have been independently tracked in each camera in a multi-camera net-
work, as in [7], and then aim to determine correspondences between these tracks
in a decentralised way, that is, without a centralised server. As in [7], our ap-
proach requires no camera calibration, or knowledge of the relative positions of
cameras and entry and exit locations of objects.
In contrast to [7], we remove each of the constraints listed earlier. We use a
kernel-based tracking algorithm, which creates kernels over the entire FOV of
each camera rather than only at entry and exit points. Our system effectively
performs unsupervised, online learning of a correspondence model by continuous
collection and updating of tracking statistics. This allows the proposed algorithm
to be performed in real-time with no need for a dedicated and supervised training
phase, thereby lifting constraints 1 and 2. To enable this collection of tracking
statistics we introduce the concept of reusing kernels, and show that by using
this technique the memory usage of the system is bounded. We then introduce
a location-based kernel matching method to address abrupt changes in visual
characteristics of objects (often due to changes in object pose or camera angle)
based on the historical data available through reusing kernels, thereby lifting con-
straint 3. This enables us to develop a lightweight, decentralised, multi-camera
tracking solution with limited communication between cameras ensuring that an
on-camera implementation is possible without requiring a coordinating server.
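The bounded-memory property of reused kernels can be illustrated with a simplified sketch: if kernels are drawn from a fixed grid covering each FOV rather than created per observation, the table of inter-camera transition statistics has a fixed maximum size and can be updated online. The grid resolution, image dimensions, and helper names below are illustrative assumptions, not the paper's actual kernel construction.

```python
from collections import defaultdict

GRID = 8  # kernels per axis; bounds the table at GRID**4 counters per camera pair

def kernel_id(x, y, width=640, height=480):
    """Map an image location to the id of the reused grid kernel covering it."""
    return (min(int(x * GRID / width), GRID - 1),
            min(int(y * GRID / height), GRID - 1))

# (exit_kernel, entry_kernel) -> observation count, updated online
transitions = defaultdict(int)

def observe(exit_xy, entry_xy):
    """Record one exit-to-entry hand-off between two cameras."""
    transitions[(kernel_id(*exit_xy), kernel_id(*entry_xy))] += 1

def most_likely_entry(exit_xy):
    """Return the entry kernel with the highest empirical transition count."""
    k = kernel_id(*exit_xy)
    candidates = {e: c for (s, e), c in transitions.items() if s == k}
    return max(candidates, key=candidates.get) if candidates else None

# Simulated hand-offs: objects leaving near the right edge of camera A
# tend to reappear near the left edge of camera B.
for _ in range(20):
    observe((630, 240), (10, 240))
observe((630, 240), (300, 400))  # one spurious correspondence

print(most_likely_entry((625, 250)))  # the left-edge entry kernel, (0, 4)
```

Because the set of kernels is fixed in advance, continuous collection of statistics never grows the table beyond its grid-determined bound, which is what makes an on-camera, training-free deployment plausible.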