Human Action Recognition Based on Tracking Features - Foundations on Natural and Artificial Computation

Human Action Recognition Based on Tracking Features

Javier Hernandez, Antonio S. Montemayor,

Juan Jose Pantrigo, and Angel Sanchez

Departamento de Ciencias de la Computacion

Universidad Rey Juan Carlos, C/Tulipan, s/n,

28933 Mostoles, Madrid, Spain

{javier.hernandez,antonio.sanz,juanjose.pantrigo,angel.sanchez}@urjc.es

Abstract. Visual recognition of human actions in image sequences is

an active field of research. However, most recently published methods use

complex models and heuristics both to represent the human body and to

classify its actions. Our approach follows a different strategy. It is based on

simple feature extraction from descriptors obtained from a visual tracking

system. The tracking system provides useful information such as the

position and size of the subject at every time step of a sequence,

and in this paper we show that the evolution of some of these features

is enough to classify an action in most cases.

1 Introduction

Human action recognition aims to understand patterns of human movement

from image sequences and classify those actions into known categories. This is a

relevant problem in computer vision since it has applications in video surveillance

and monitoring, human-computer interaction, augmented reality, and so on [1].

Human actions consist of spatio-temporal patterns that are generated by a

complex, time-varying, non-linear dynamic system. A complete description

of the system requires enumerating all the variables, their interdependencies,

the equations controlling their evolution, and a set of boundary conditions to be

satisfied by the system [9]. Processing such a description usually requires too

many computational resources, which makes it intractable in real time in most

cases.

A standard approach to human action recognition is to extract a set of

features from each image sequence and use it to train classifiers to perform

recognition. Using those features, a system can either classify directly or fit a model

and use that model to classify. Approaches can be grouped according to the

image properties they use, such as motion-based, shape-based, gradient-based, etc. [3].
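
As a rough illustration of this extract-then-classify pipeline, the following Python sketch summarizes a track (per-frame position and size, of the kind a tracker might report) as a small feature vector and labels it with a nearest-centroid rule. The track layout, the three features, the function names, and the classifier are all assumptions chosen for illustration; they are not the features or the classifier used in this paper.

```python
from math import dist
from statistics import mean, pstdev

# Hypothetical track layout: a list of (x, y, width, height) tuples,
# one per frame, with at least two frames.

def extract_features(track):
    """Return (mean horizontal speed, spread of vertical position, mean height/width ratio)."""
    xs = [x for x, _, _, _ in track]
    ys = [y for _, y, _, _ in track]
    speeds = [abs(b - a) for a, b in zip(xs, xs[1:])]  # frame-to-frame displacement in x
    aspect = [h / w for _, _, w, h in track]
    return (mean(speeds), pstdev(ys), mean(aspect))

def train_centroids(labelled_tracks):
    """Average the feature vectors of the training tracks of each action label."""
    by_label = {}
    for label, track in labelled_tracks:
        by_label.setdefault(label, []).append(extract_features(track))
    return {label: tuple(mean(column) for column in zip(*features))
            for label, features in by_label.items()}

def classify(track, centroids):
    """Assign the label whose centroid is closest (Euclidean distance) in feature space."""
    features = extract_features(track)
    return min(centroids, key=lambda label: dist(features, centroids[label]))

# Synthetic example data (not taken from any dataset).
walk = [(2 * t, 50, 20, 60) for t in range(10)]               # steady motion in x
jump = [(100, 50 + 10 * (t % 2), 20, 60) for t in range(10)]  # oscillation in y
centroids = train_centroids([("walk", walk), ("jump", jump)])
unknown = [(3 * t, 51, 22, 60) for t in range(8)]
label = classify(unknown, centroids)
```

The features are left unscaled here for brevity; with real tracker output they would normally be normalized so that no single feature dominates the distance.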

Several features have been proposed in the literature. In Wang and Suter [1],

features extracted from human silhouettes or from their distance transform

are classified using three different methods: Gaussian mixture models, matching

based on the Hausdorff distance, and continuous hidden Markov models. Zhou

et al. [4] employed time-space human silhouettes that are transformed into low
