SUNAR Surveillance Network Augmented by Retrieval - Advanced Concepts for Intelligent Vision Systems

4.1 Metadata Cleaning

The preprocessed data is assumed to be incomplete or duplicated, biased, and noisy. Moving objects are therefore modeled as dynamic systems, for which the Kalman filter minimizes the mean squared estimation error [5]. The filter can also fill in missing information (position and velocity) for a few seconds, for instance when the object has been occluded (see footnote 1).
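The gap-filling behaviour can be illustrated with a minimal sketch, assuming a constant-velocity model for a single coordinate with a unit time step. The class name `Kalman1D`, the function `clean_track`, and the noise values `q` and `r` are hypothetical and chosen only for illustration; they are not taken from SUNAR. A missing measurement is passed as `None`, in which case only the prediction step runs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate (position, velocity)."""
    pos: float = 0.0
    vel: float = 0.0
    # Entries of the 2x2 state covariance matrix P (assumed initial values).
    p00: float = 1.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 100.0
    q: float = 0.01  # process noise added to the diagonal (assumed value)
    r: float = 1.0   # measurement noise variance (assumed value)

    def predict(self, dt: float = 1.0) -> None:
        # x = F x and P = F P F^T + Q, with F = [[1, dt], [0, 1]].
        self.pos += self.vel * dt
        n00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        n01 = self.p01 + dt * self.p11
        n10 = self.p10 + dt * self.p11
        n11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = n00, n01, n10, n11

    def update(self, z: float) -> None:
        # Only the position is measured, i.e. H = [1, 0].
        s = self.p00 + self.r
        k0, k1 = self.p00 / s, self.p10 / s
        y = z - self.pos
        self.pos += k0 * y
        self.vel += k1 * y
        n00 = (1 - k0) * self.p00
        n01 = (1 - k0) * self.p01
        n10 = self.p10 - k1 * self.p00
        n11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = n00, n01, n10, n11


def clean_track(measurements: List[Optional[float]]) -> List[Tuple[float, float]]:
    """Return (position, velocity) per time step; None marks an occluded step."""
    if not measurements or measurements[0] is None:
        raise ValueError("the first measurement must be observed")
    kf = Kalman1D(pos=measurements[0])
    states = [(kf.pos, kf.vel)]
    for z in measurements[1:]:
        kf.predict()
        if z is not None:  # during occlusion the prediction alone fills the gap
            kf.update(z)
        states.append((kf.pos, kf.vel))
    return states
```

For an object moving at roughly one unit per step, `clean_track([0.0, 1.0, 2.0, 3.0, 4.0, None, None, 7.0, 8.0])` yields predicted positions close to 5 and 6 for the two occluded steps.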

At the cleaning step, SUNAR stores metadata representing moving objects and information about the environment under surveillance.

4.2 Indexing and Storing

The SUNAR database consists of three schemas, named according to their purpose: Process, Training and Evaluation. All schemas contain three main tables that correspond to the fundamental concepts: Object, Track and State (as in our former work [5]). An Object is an abstract representation of a real object with a globally unique ID; it is represented by its states. A State consists of two types of features: visual properties (as described in section 3) and spatio-temporal features. The latter are the location and velocity of an object at a given moment. A Track is a sequence of such states in the spatio-temporal subspace of the area under surveillance observed by one camera.
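The three main tables could be laid out roughly as follows. This is only a sketch: it uses an in-memory SQLite database as a stand-in for whatever database system SUNAR actually runs on, and all column names, as well as the assumption that each track belongs to exactly one object, are hypothetical.

```python
import sqlite3

# Hypothetical column layout for the Object, Track and State tables.
SCHEMA = """
CREATE TABLE object (
    id INTEGER PRIMARY KEY              -- globally unique object ID
);
CREATE TABLE track (
    id INTEGER PRIMARY KEY,
    object_id INTEGER REFERENCES object(id),
    camera_id INTEGER NOT NULL          -- the camera observing this subspace
);
CREATE TABLE state (
    track_id INTEGER REFERENCES track(id),
    t REAL NOT NULL,                    -- time of the state
    x REAL, y REAL,                     -- location
    vx REAL, vy REAL,                   -- velocity
    visual BLOB                         -- visual feature descriptors
);
"""


def create_database() -> sqlite3.Connection:
    """Create an empty in-memory database with the three main tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def object_states(conn: sqlite3.Connection, object_id: int) -> list:
    """Return all states of an object across its tracks, ordered by time."""
    query = (
        "SELECT tr.camera_id, s.t, s.x, s.y, s.vx, s.vy "
        "FROM state s JOIN track tr ON s.track_id = tr.id "
        "WHERE tr.object_id = ? ORDER BY s.t"
    )
    return conn.execute(query, (object_id,)).fetchall()
```

Under this layout, an object seen by several cameras has one track per camera, and `object_states` recovers its full state sequence across all of them.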

The Training schema also contains tables with statistics and classification models, according to the method used. For instance, a simplified Bayesian model table contains columns for the IDs of the source and destination cameras between which objects pass. The following columns hold the number of training samples, a prior probability, and the averages and variances of the handover time, trajectory states and visual features. Trajectories are summarized as a weighted average of cleaned states, where the weight is highest at the end of the trajectory. If cameras overlap, the handover time may be negative. The averages and variances of the different feature descriptors act as visual bias removal (illumination, color, viewpoint and blob size calibration) for the integration step.
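The trajectory summary and the handover statistics can be sketched as below. The text does not specify the weighting function, so linearly increasing weights (1, 2, ..., n) are assumed here. The handover time is assumed to be the entry time into the destination camera minus the exit time from the source camera, and the population variance is used; both function names are hypothetical.

```python
from statistics import mean, pvariance
from typing import List, Sequence, Tuple


def summarize_trajectory(states: Sequence[Sequence[float]]) -> List[float]:
    """Weighted average of state vectors; later states receive larger weights."""
    if not states:
        raise ValueError("at least one state is required")
    weights = range(1, len(states) + 1)  # assumed: linear, highest at the end
    total = sum(weights)
    dims = len(states[0])
    return [
        sum(w * s[d] for w, s in zip(weights, states)) / total
        for d in range(dims)
    ]


def handover_stats(exit_times: Sequence[float],
                   entry_times: Sequence[float]) -> Tuple[float, float]:
    """Mean and variance of handover times for one (source, destination) pair."""
    # A negative difference means the object entered the destination camera
    # before leaving the source camera, i.e. the two views overlap.
    deltas = [entry - exit_ for exit_, entry in zip(exit_times, entry_times)]
    return mean(deltas), pvariance(deltas)
```

With overlapping cameras, exit times `[10.0, 20.0]` and entry times `[9.0, 18.0]` give a negative average handover time, which the table can store like any other value.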

4.3 Multiple Camera Integration

The training schema described above is rather simplified. In fact, we use Gaussian Mixture Model and Support Vector Machine [14,8] models of the (inverted) Kalman filter state, as described in our previous work [5]. The inverted state is computed by running the Kalman filter in the direction opposite to the object's movement through the subspace observed by one camera. The goal of this trick is to classify the previous subspace (camera) in which the object was most probably last seen.
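One way to picture the inverted state is to feed a track's measurements to the filter in reverse time order, so that the final estimate lies near the point where the object entered the view and its velocity points back toward where the object came from. The following is a minimal sketch under that reading, for one coordinate with a constant-velocity model and unit time step. The function names and the noise values `q` and `r` are assumptions, not the authors' implementation.

```python
from typing import Sequence, Tuple


def kalman_final_state(zs: Sequence[float], q: float = 0.01,
                       r: float = 1.0) -> Tuple[float, float]:
    """Run a constant-velocity Kalman filter (dt = 1); return (position, velocity)."""
    if not zs:
        raise ValueError("at least one measurement is required")
    pos, vel = zs[0], 0.0
    # Covariance entries: position as certain as a measurement, velocity unknown.
    p00, p01, p10, p11 = r, 0.0, 0.0, 100.0
    for z in zs[1:]:
        # Predict: x = F x, P = F P F^T + Q with F = [[1, 1], [0, 1]].
        pos += vel
        p00, p01, p10, p11 = (p00 + p01 + p10 + p11 + q,
                              p01 + p11, p10 + p11, p11 + q)
        # Update with a position-only measurement (H = [1, 0]).
        s = p00 + r
        k0, k1 = p00 / s, p10 / s
        y = z - pos
        pos, vel = pos + k0 * y, vel + k1 * y
        p00, p01, p10, p11 = ((1 - k0) * p00, (1 - k0) * p01,
                              p10 - k1 * p00, p11 - k1 * p01)
    return pos, vel


def inverted_state(zs: Sequence[float]) -> Tuple[float, float]:
    """Filter the track backwards in time to estimate the state at its start."""
    return kalman_final_state(list(reversed(zs)))
```

For a track observed at positions 0, 1, 2, 3, 4, the forward state ends near position 4 with velocity close to +1, while the inverted state ends near position 0 with velocity close to -1. A classifier trained on such inverted states could then be used to predict the previous camera.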

The object identification then maximizes the (prior) probability of a previous location (camera) multiplied by the normalized similarity (feature distance without bias) to previously identified objects, according to average time constraints and visual features in the database [5,10]. More formally, an optimal

1 Available at www.fit.vutbr.cz/research/view_product.php.en?id=53
