Direction in ND, Motion as Direction - Vision with Direction

We illustrate the signal discretization effects of translational motion in Fig. 12.4, where a 1D pattern containing the full range of frequencies is translated by zero, 0.67, and 2 pixels per frame. In the color images, the vertical axis is the time axis. The image is actually a gray image (see Fig. 8.8), but to illustrate the translation effectively, every gray tone has been replaced by a unique color. The corresponding schematic FTs of the signals are shown beside the images, with the red lines representing the nonzero power. For small motions there is no aliasing wrap-around, because the temporal sampling frequency is sufficient to discretize the motion faithfully. However, when the motion magnitude is increased above 1 pixel/frame, the sampling rate becomes too low for a satisfactory representation. The result is a distortion (aliasing), or frequency wrap-around, caused by the repetition of the Nyquist square in the $k_t$ direction. Had the spatio-temporal signal been oversampled by a factor of 2 in the $x$ as well as in the $t$ direction, the extra red lines caused by the repetition in the $k_t$ direction would not appear, because the signal would then be confined to the cyan-colored zone shown in the figure. In conclusion, motion sequences must be oversampled by a factor of 2 in the spatial as well as the temporal domain to avoid the aliasing caused by high spatial frequency content (quick variations) moving fast. Independently, this is also what is required if the structure tensor is to be used to compute the velocity.
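
The wrap-around described above can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes a single complex exponential $\exp(i k_x (x - v t))$ as the moving pattern, one sample per frame, and angular frequencies in radians per sample. The function names and the chosen test values ($k_x = 0.8\pi$, 100 frames) are illustrative assumptions. For such a pattern the continuous temporal frequency is $k_t = -v k_x$, and frame sampling folds it back into $[-\pi, \pi)$ whenever $|v k_x| \ge \pi$.

```python
import cmath
import math


def wrap_to_nyquist(omega):
    """Wrap an angular frequency (rad/sample) into the interval [-pi, pi)."""
    return (omega + math.pi) % (2 * math.pi) - math.pi


def predicted_temporal_frequency(kx, v):
    """Temporal frequency expected after frame sampling.

    The continuous pattern exp(i*kx*(x - v*t)) has k_t = -v*kx; sampling
    once per frame repeats the spectrum with period 2*pi along k_t, so the
    observed frequency is the wrapped value.
    """
    return wrap_to_nyquist(-v * kx)


def measured_temporal_frequency(kx, v, n_frames=100, x=0.0):
    """Locate the DFT peak along t for the pattern observed at pixel x."""
    samples = [cmath.exp(1j * kx * (x - v * t)) for t in range(n_frames)]
    best_bin, best_power = 0, -1.0
    for m in range(n_frames):
        # Plain O(N^2) DFT, adequate for this small illustration.
        coeff = sum(s * cmath.exp(-2j * math.pi * m * t / n_frames)
                    for t, s in enumerate(samples))
        if abs(coeff) > best_power:
            best_bin, best_power = m, abs(coeff)
    # Map the bin index to an angular frequency in [-pi, pi).
    return wrap_to_nyquist(2 * math.pi * best_bin / n_frames)


if __name__ == "__main__":
    kx = 0.8 * math.pi  # a high spatial frequency (assumed test value)
    for v in (0.0, 0.67, 2.0):
        kt = measured_temporal_frequency(kx, v)
        print(f"v = {v:4.2f}  measured k_t = {kt / math.pi:+.2f} pi  "
              f"apparent velocity = {-kt / kx:+.2f} pixel/frame")
```

Under these assumptions, the pattern moving at 2 pixels/frame is indistinguishable from one moving at $-0.5$ pixel/frame, whereas at 0.67 pixel/frame the measured frequency agrees with $-v k_x$ to within one DFT bin.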

12.8 Affine Motion by the Structure Tensor in 7D

Whereas a translational model is often adequate to describe the motion in a local image, it sometimes becomes insufficient when the image patch to be analyzed is enlarged, because the complexity of the motion generally increases with the patch size. A more elaborate model, such as an affine motion model, is generally better suited to describe the motion of a rigid object, particularly if the field of view is small enough [132, 141]. Here we only treat the case when all image points move according to the same affine motion model, whose parameters need to be identified.

As before, we assume that BCC is valid, meaning that the spatio-temporal image is generated from one frame of the image, i.e., the brightness distribution originating from a certain instant. The rest of the spatio-temporal image is generated by applying the affine coordinate transformation to the spatial coordinates, $\mathbf{s} = (x, y)^T$:

$$\mathbf{s}^* = \mathbf{s} + \delta t\,[\mathbf{A}_0 \mathbf{s} + \mathbf{v}_0] \quad\Rightarrow\quad f(x, y, t) = g(\mathbf{s} + \delta t\,[\mathbf{A}_0 \mathbf{s} + \mathbf{v}_0]) \tag{12.81}$$
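
As a minimal sketch of how (12.81) generates a sequence from a single frame, the snippet below evaluates the displaced coordinates and builds $f$ from a given $g$. It assumes that $g$ is available as a continuous function of $(x, y)$ and that $\delta t$ is the time elapsed since the reference frame; the helper names `affine_term`, `displaced_point`, and `make_sequence` are hypothetical and do not come from the text.

```python
def affine_term(A0, v0, s):
    """Return A0 s + v0 for a 2D point s = (x, y).

    A0 is a 2x2 matrix given as ((a11, a12), (a21, a22)); v0 is (v0x, v0y).
    """
    x, y = s
    (a11, a12), (a21, a22) = A0
    return (a11 * x + a12 * y + v0[0], a21 * x + a22 * y + v0[1])


def displaced_point(A0, v0, s, dt):
    """Return s* = s + dt * (A0 s + v0), the coordinate change in (12.81)."""
    ux, uy = affine_term(A0, v0, s)
    return (s[0] + dt * ux, s[1] + dt * uy)


def make_sequence(g, A0, v0):
    """Build f(x, y, t) = g(s + t * (A0 s + v0)) from a single frame g.

    Assumption: t plays the role of the elapsed time delta-t since the
    reference frame, so that f(x, y, 0) = g(x, y).
    """
    def f(x, y, t):
        xs, ys = displaced_point(A0, v0, (x, y), t)
        return g(xs, ys)
    return f


if __name__ == "__main__":
    # Illustrative frame: a linear brightness ramp (assumed, for checking).
    ramp = lambda x, y: x + 10.0 * y

    # Pure translation: A0 = 0 and v0 = (2, 0).
    f_trans = make_sequence(ramp, ((0.0, 0.0), (0.0, 0.0)), (2.0, 0.0))
    print(f_trans(1.0, 0.0, 0.5))

    # Pure rotation about the origin: antisymmetric A0 and v0 = 0.
    rotation = ((0.0, -1.0), (1.0, 0.0))
    print(displaced_point(rotation, (0.0, 0.0), (1.0, 0.0), 0.1))
```

Setting $\mathbf{A}_0 = 0$ reduces the model to the translational case, which makes it easy to compare the two on the same frame.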

Consequently, the affine model is characterized by a velocity field $\mathbf{v}$ expressed in the parametric form:

$$\mathbf{v}(\mathbf{s}) = \mathbf{A}_0 \mathbf{s} + \mathbf{v}_0 \tag{12.82}$$

with $\mathbf{v}_0$ and $\mathbf{A}_0$ being a 2D vector and a $2 \times 2$ invertible matrix, respectively. There are two real parameters in the constant translation $\mathbf{v}_0$, and four in the matrix $\mathbf{A}_0$, representing rotation, scaling, and the (two) shearing deformations, totaling six degrees of
