Image Processing Reference
One of the central questions in video processing is how to follow an object over
time. Imagine you are designing a game where the position of the hand is used
as a controller. What you need from your video processing software is then the
position of the hand in each image and hence a temporal sequence of coordinates,
see Table 9.1.
We can also illustrate this as a number of points in a coordinate system. If we
connect these points we will have a curve through time, see Fig. 9.1. This curve is
denoted the trajectory of the object.
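As a minimal sketch, a trajectory can simply be held as a time-ordered list of (x, y) coordinates, with the list index playing the role of the time step; the positions below are hypothetical values, not taken from Table 9.1.

```python
# A trajectory as a time-ordered sequence of (x, y) positions.
# The list index serves as the time step t (one entry per image).
trajectory = [(312, 76), (315, 80), (319, 87), (325, 97)]  # hypothetical hand positions

for t, (x, y) in enumerate(trajectory):
    print(f"t={t}: x={x}, y={y}")
```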
The notion of a trajectory is not limited to the position of the object. We can
generalize the concept and say that the object is represented by a so-called state
vector , where each entry in the vector contains the value of a certain parameter at a
particular time step. Besides position, such entries could be velocity, acceleration,
size, shape, color etc. Formally we define tracking to be a matter of finding the
trajectory of an object's state. This chapter will define a framework for tracking,
namely the so-called predict-match-update framework, see Fig. 9.6. Without loss of
generality we will below assume the state is only the position of the object, meaning
that the state vector we seek to find is s(t) = [x(t), y(t)]. Below the framework is
built up one block at a time.
We can use some of the methods described earlier in this book to detect an
object. If we do this in each image and simply concatenate the positions, we could
argue that we are doing tracking. This approach is, however, not considered tracking,
since each detection is done independently of all other detections, i.e., no temporal
information is included.
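The naive concatenation approach can be sketched as follows; `detect` is a hypothetical stand-in for any detector from earlier chapters, and the frame data is invented for illustration. Note that each call ignores every other frame, which is exactly why this is not tracking.

```python
# Naive "tracking" by independent per-frame detection.
# detect() is a hypothetical placeholder for a detector from earlier chapters.

def detect(frame):
    # Placeholder: pretend the detector returns the object's (x, y) position.
    return frame["object_position"]

frames = [{"object_position": (10, 20)}, {"object_position": (12, 23)}]

# Each detection is computed independently of all others -- no temporal
# information links one frame's result to the next.
positions = [detect(f) for f in frames]
```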
The simplest form of tracking is when the estimated position is updated using
previous states. The current and previous states are combined in order to smooth the
current state. The need for this is motivated by the fact that noise will always appear
in the estimated position.
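One simple way to combine the current measurement with the previous state is an exponential moving average; this is just one possible smoothing choice, sketched here under the assumption that the state is a 2D position.

```python
# Minimal sketch: smooth a noisy position estimate by blending the new
# measurement with the previous smoothed state (exponential moving average).

def smooth(prev_state, measurement, alpha=0.5):
    """Combine the previous smoothed state with the new measurement.

    alpha close to 1 trusts the new measurement; alpha close to 0
    trusts the accumulated history more.
    """
    px, py = prev_state
    mx, my = measurement
    return (alpha * mx + (1 - alpha) * px,
            alpha * my + (1 - alpha) * py)

measurements = [(10.0, 20.0), (14.0, 22.0), (11.0, 25.0)]  # noisy detections
state = measurements[0]
for m in measurements[1:]:
    state = smooth(state, m)
# state now holds a smoothed position that lags the raw measurements slightly.
```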