Selection of Visual Descriptors for the Purpose of Multi-camera Object Re-identification - Feature Selection for Data and Pattern Recognition


In this chapter, the authors' approach to the aforementioned problem of object re-identification is described. Section 12.2 presents the algorithms used for the low-level video analysis, which extracts moving objects from the individual camera images. In Sect. 12.3, various approaches to multi-camera object tracking are presented. Object re-identification requires a selection of distinctive image features that allow two appearances of the same object in separate cameras to be matched and, at the same time, ensure that two different objects do not have similar feature sets. To this end, the most commonly used visual feature descriptors and the methods of extracting them from the image are discussed in Sect. 12.4. These descriptors are then evaluated in order to select the subset best suited to object re-identification in multiple cameras, as presented in Sect. 12.5. In the following section, the selected descriptors are applied to the object re-identification task. The authors propose to use a classifier that is trained on the feature descriptors of a single object obtained from all its appearances in a single camera, and is then used for comparison with the feature sets obtained from other cameras. Four classification methods are described in Sect. 12.6 and the optimal solution is selected. Finally, the results of experiments performed using the proposed framework are presented and discussed in Sect. 12.7.

12.2 Video Preprocessing

Generally, an image contains visual information at several levels of complexity, depending on the application: from detection of a single colour light, through analysis of raster binary maps and colour photos of simple geometry, to understanding of complex natural scenes and, finally, tracking of objects moving against a complex background [4].

Digital images, represented as an RGB pixel matrix, can be processed in many different ways. Some methods treat each pixel independently of its surroundings (contrast enhancement, gamma correction), while others require the pixel context, i.e. the colour values of neighbouring pixels (noise reduction, sharpening). Numerous methods of image filtering can be found in the literature [8, 9, 21], employing FFT, DCT and wavelet transformations [42].
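The difference between the two groups of methods can be illustrated with a minimal sketch in plain Python. It is not taken from the chapter: the function names `gamma_correct` and `mean_filter_3x3`, the 8-bit grey-level image stored as a list of rows, and the choice of a 3x3 mean filter as the noise reduction example are all assumptions made for illustration.

```python
def gamma_correct(image, gamma):
    """Point operation: each output pixel depends only on the same input pixel."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in image]


def mean_filter_3x3(image):
    """Neighbourhood operation: each output pixel averages its 3x3 surroundings.

    Border pixels average only the neighbours that fall inside the image.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(round(sum(values) / len(values)))
        result.append(row)
    return result


# A flat grey patch with one bright, noisy pixel in the centre (hypothetical data).
noisy = [
    [100, 100, 100],
    [100, 190, 100],
    [100, 100, 100],
]
brightened = gamma_correct(noisy, 0.5)  # gamma < 1 brightens every pixel independently
smoothed = mean_filter_3x3(noisy)       # the centre becomes the mean of all nine values
```

In this sketch, changing the neighbours of a pixel leaves its gamma-corrected value untouched, whereas the output of the mean filter changes, which is exactly the distinction between context-free and context-dependent processing drawn above.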

For the object re-identification framework described here, the most important objects are those in motion, characterised by a particular colour distribution and texture, disturbed by noise, and changing in appearance over time due to the motion and deformation of the object and the motion of the camera in 3D space. In order to obtain information on the movement, algorithms for object detection, recognition, and tracking are applied [12, 28]. The purpose of the object detection routine is to select the image pixels belonging to moving objects and to extract the image regions representing individual objects. These regions are later used to calculate important visual information on these objects. Usually, a background modelling and subtraction approach is used for this task [45]. The background model comprises statistical profiles of the most probable values of background pixels, used to determine what colour and brightness of a pixel in the current frame should be treated as the
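The idea of background modelling and subtraction can be sketched as follows. This is a deliberately simplified illustration and not necessarily the model used in [45]: it assumes grey-level frames stored as lists of rows and keeps a single running mean and variance per pixel. The class name `RunningGaussianBackground` and all parameter values are hypothetical.

```python
class RunningGaussianBackground:
    """Per-pixel running mean/variance background model for grey-level frames."""

    def __init__(self, first_frame, learning_rate=0.05, threshold=2.5,
                 initial_variance=25.0, min_variance=4.0):
        self.alpha = learning_rate        # how quickly the model adapts
        self.k = threshold                # allowed deviation, in standard deviations
        self.min_variance = min_variance  # floor so a static scene stays tolerant of noise
        self.mean = [[float(p) for p in row] for row in first_frame]
        self.var = [[initial_variance for _ in row] for row in first_frame]

    def apply(self, frame):
        """Return a binary foreground mask; update the model on background pixels only."""
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, pixel in enumerate(row):
                diff = pixel - self.mean[y][x]
                # A pixel is foreground if it deviates from the profile by more than k sigma.
                is_foreground = diff * diff > (self.k ** 2) * self.var[y][x]
                if not is_foreground:
                    self.mean[y][x] += self.alpha * diff
                    updated = self.var[y][x] + self.alpha * (diff * diff - self.var[y][x])
                    self.var[y][x] = max(self.min_variance, updated)
                mask_row.append(1 if is_foreground else 0)
            mask.append(mask_row)
        return mask


# Hypothetical data: a static scene, then one pixel covered by a bright moving object.
background = [[50, 50, 50, 50] for _ in range(3)]
model = RunningGaussianBackground(background)
for _ in range(10):
    model.apply(background)

frame = [row[:] for row in background]
frame[1][2] = 200
mask = model.apply(frame)  # only the changed pixel is marked as foreground
```

The connected regions of ones in such a mask would then correspond to the image regions of individual moving objects. Updating the model only on background pixels is one possible design choice; it keeps a slowly moving object from being absorbed into the background profile.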
