Image Processing Reference
One of the central questions in video processing is how to follow an object over
time. Imagine you are designing a game where the position of the hand is used
as a controller. What you need from your video processing software is then the
position of the hand in each image and hence a temporal sequence of coordinates,
see Table 9.1.
We can also illustrate this as a number of points in a coordinate system. If we
connect these points we will have a curve through time, see Fig. 9.1. This curve is
denoted the trajectory of the object.
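As a minimal sketch, a trajectory can simply be held as a time-ordered list of (x, y) coordinates, with the list index playing the role of the time step; the positions below are hypothetical values, not taken from Table 9.1.

```python
# A trajectory as a time-ordered sequence of (x, y) positions.
# The list index serves as the time step t (one entry per image).
trajectory = [(312, 76), (315, 80), (319, 87), (325, 97)]  # hypothetical hand positions

for t, (x, y) in enumerate(trajectory):
    print(f"t={t}: x={x}, y={y}")
```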
The notion of a trajectory is not limited to the position of the object. We can
generalize the concept and say that the object is represented by a so-called state
vector , where each entry in the vector contains the value of a certain parameter at a
particular time step. Besides position, such entries could be velocity, acceleration,
size, shape, color etc. Formally we define tracking to be a matter of finding the
trajectory of an object's state. This chapter will define a framework for tracking,
namely the so-called predict-match-update framework, see Fig. 9.6. Without loss of
generality we will below assume the state is only the position of the object, meaning
that the state vector we seek to find is s(t) = [x(t), y(t)]. Below the framework is
built up one block at a time.
We can use some of the methods described earlier in this book to detect an
object. If we do this in each image and simply concatenate the positions, we could
argue that we are doing tracking. This approach is, however, not considered tracking,
since each detection is done independently of all other detections, i.e., no temporal
information is included.
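The naive concatenation approach can be sketched as follows; `detect` is a hypothetical stand-in for any detector from earlier chapters, and the frame data is invented for illustration. Note that each call ignores every other frame, which is exactly why this is not tracking.

```python
# Naive "tracking" by independent per-frame detection.
# detect() is a hypothetical placeholder for a detector from earlier chapters.

def detect(frame):
    # Placeholder: pretend the detector returns the object's (x, y) position.
    return frame["object_position"]

frames = [{"object_position": (10, 20)}, {"object_position": (12, 23)}]

# Each detection is computed independently of all others -- no temporal
# information links one frame's result to the next.
positions = [detect(f) for f in frames]
```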
The simplest form of tracking is when the estimated position is updated using
previous states. The current and previous states are combined in order to smooth the
current state. The need for this is motivated by the fact that noise will always appear
in the estimated position.
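One simple way to combine the current measurement with the previous state is an exponential moving average; this is just one possible smoothing choice, sketched here under the assumption that the state is a 2D position.

```python
# Minimal sketch: smooth a noisy position estimate by blending the new
# measurement with the previous smoothed state (exponential moving average).

def smooth(prev_state, measurement, alpha=0.5):
    """Combine the previous smoothed state with the new measurement.

    alpha close to 1 trusts the new measurement; alpha close to 0
    trusts the accumulated history more.
    """
    px, py = prev_state
    mx, my = measurement
    return (alpha * mx + (1 - alpha) * px,
            alpha * my + (1 - alpha) * py)

measurements = [(10.0, 20.0), (14.0, 22.0), (11.0, 25.0)]  # noisy detections
state = measurements[0]
for m in measurements[1:]:
    state = smooth(state, m)
# state now holds a smoothed position that lags the raw measurements slightly.
```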