A Motion Descriptor Based on Statistics of Optical Flow Orientations for Action Classification in Video-Surveillance - Multimedia and Signal Processing


need an accurate silhouette segmentation, but they are limited in scenarios with complex backgrounds, illumination changes, noise and, obviously, a moving camera.

On the other hand, GMDs are commonly used in surveillance applications to detect abnormal movements or to characterize human activities by computing relevant features that highlight and summarize motion. For example, 3D spatio-temporal Haar features have been proposed to build volumetric descriptors in pedestrian applications [6]. GMDs are also frequently based on the apparent motion field (optical flow), a choice justified by its relative independence from visual appearance. For instance, Ikizler et al. [3] used histograms of orientations of (block-based) optical flow combined with contour orientations. This method can distinguish simple periodic actions, but its temporal integration is too limited to address more complex activities. Guangyu et al. [5] use dense optical flow and histogram descriptors, but their representation, based on human-centric spatial pattern variations, limits their approach to specific applications. Chaudhry et al. [4] proposed histograms of oriented optical flow (HOOF) to describe human activities. Our descriptor for the instantaneous velocity field is very close to the HOOF descriptor, with significant differences that will be highlighted later; the temporal part of their descriptor, however, is based on time series of HOOFs, which is very different from our approach.

The main contribution of this work is a motion descriptor that is based entirely on dense optical flow information and is usable for the recognition of actions or events occurring in surveillance video sequences. The instantaneous movement information, represented by the optical flow field at every frame, is summarized by orientation histograms weighted by the norm of the velocity. The temporal sequence of orientation histograms is then characterized, for every histogram bin, by temporal statistics computed over the sequence. The resulting motion descriptor achieves a compact description of human activity, which is used as the input of a binary SVM classifier. Evaluation is performed with the Weizmann dataset [8], from which 10 natural actions are picked, and with the ViSOR video-surveillance dataset [9], from which 5 different activities are used. This paper is organized as follows: Section 2 introduces the proposed descriptor, Section 3 demonstrates the effectiveness of the method, and the last section concludes with a discussion and possible future work.

2 The Proposed Approach

The method is summarized in Figure 1. It starts by computing a dense optical flow using the local jet feature space approach [10]. The dense optical flow makes it possible to segment the region with the more coherent motion into a RoI. A motion orientation histogram is then calculated, typically using 32 directions. Every direction count is weighted by the norm of the flow vector, so an important motion direction can be due either to many vectors or to vectors with large norms. Finally, the motion descriptor summarizes each direction by simple statistics on the temporal series, whose purpose is to capture the nature of the motion.
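The two last steps of this pipeline can be illustrated with a minimal sketch, given here only as an illustration and not as the authors' implementation. It assumes the flow vectors inside the RoI are already available as (vx, vy) pairs. The function names `orientation_histogram` and `temporal_descriptor` are hypothetical, and the per-bin statistics chosen here (maximum, mean and standard deviation) are an assumption, since the text only speaks of "simple statistics".

```python
import math


def orientation_histogram(flow, n_bins=32):
    """Norm-weighted histogram of flow orientations for one frame.

    flow: iterable of (vx, vy) velocity vectors (assumed to come from the RoI).
    Each vector adds its norm to the bin that contains its orientation.
    """
    hist = [0.0] * n_bins
    bin_width = 2.0 * math.pi / n_bins
    for vx, vy in flow:
        norm = math.hypot(vx, vy)
        if norm == 0.0:
            continue  # a null vector has no orientation
        angle = math.atan2(vy, vx) % (2.0 * math.pi)  # map to [0, 2*pi)
        k = int(angle / bin_width) % n_bins  # guard against rounding at 2*pi
        hist[k] += norm
    return hist


def temporal_descriptor(histograms):
    """Per-bin temporal statistics over a sequence of histograms.

    Assumption: the statistics are (max, mean, standard deviation);
    the source does not name the ones actually used.
    """
    n_frames = len(histograms)
    n_bins = len(histograms[0])
    descriptor = []
    for k in range(n_bins):
        series = [h[k] for h in histograms]  # time series of bin k
        mean = sum(series) / n_frames
        var = sum((s - mean) ** 2 for s in series) / n_frames
        descriptor.extend([max(series), mean, math.sqrt(var)])
    return descriptor


# Usage: two toy frames, 4 directions instead of 32 to keep the output short.
frames = [[(3.0, 4.0), (-3.0, -4.0)], [(3.0, 4.0)]]
histograms = [orientation_histogram(f, n_bins=4) for f in frames]
descriptor = temporal_descriptor(histograms)
```

With this layout the descriptor length is the number of bins times the number of statistics, independent of the sequence length, which is what makes it usable as a fixed-size input for a classifier.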
