Human–computer interaction - Pervasive Systems and Ubiquitous Computing


is suitable for input analysis and for recognizing hand motion. As a perceptive

processing tool, this method aims to transform percepts into a sub-symbolic

representation that encodes an emotional status. In turn, this representation

provides the cognitive processing module with a suitable input.

First, the normal flow field is computed from the input sequence. The

expectation maximization (EM) algorithm is used to fit a Gaussian to the

normal flow histogram computed across the frame. The moving arm is

detected as the dominant region in the normal flow field, that is, the set of all

the points whose normal flow value is ≥ 4σ, where σ is the standard deviation

of the fitted Gaussian. Sample points are selected as the ones with a large

gradient value as well as a large normal flow value and whose gradient is

similar to that of their neighbours. The boundary of the arm is obtained as the

shortest path connecting all the sample points, computed with Dijkstra's algorithm.

Then, affine transform parameters describing the arm global motion are

estimated from the analysis of the arm boundaries.
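
The first detection steps can be illustrated with a minimal sketch. It is not the authors' implementation: the central-difference gradients, the function names and the reading of the threshold as a deviation of at least 4σ from the fitted mean are all assumptions made for illustration. For a single Gaussian, EM reduces to the closed-form sample mean and standard deviation, so no iteration is needed here.

```python
import math

def normal_flow(prev, curr, eps=1e-6):
    """Normal flow -I_t / |grad I| for two grey-level frames (lists of rows)."""
    h, w = len(curr), len(curr[0])
    flow = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (curr[y][x + 1] - curr[y][x - 1]) / 2.0  # spatial gradient (x)
            iy = (curr[y + 1][x] - curr[y - 1][x]) / 2.0  # spatial gradient (y)
            it = curr[y][x] - prev[y][x]                  # temporal derivative
            grad = math.hypot(ix, iy)
            if grad > eps:                                # skip flat regions
                flow[y][x] = -it / grad
    return flow

def fit_gaussian(values):
    """Single-Gaussian fit: the EM solution equals the sample mean and std."""
    mu = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return mu, sigma

def dominant_region(flow, k=4.0):
    """Pixels whose normal flow deviates from the mean by at least k * sigma.

    Assumption: the text's ">= 4 sigma" is read as |u - mu| >= 4 * sigma.
    """
    mu, sigma = fit_gaussian([u for row in flow for u in row])
    if sigma == 0.0:
        return set()
    return {(y, x) for y, row in enumerate(flow) for x, u in enumerate(row)
            if abs(u - mu) >= k * sigma}
```

Calling `dominant_region(normal_flow(prev, curr))` returns the set of `(row, column)` positions that would be treated as the moving region; sample-point selection and the boundary search are not covered by this sketch.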

Symbolic information is obtained via hierarchical clustering. Learning vector

quantization (LVQ) is used to compress affine parameter vectors and to derive a

labelled Voronoi tessellation in which each tile corresponds to a motion primitive

without a precise meaning. The next layer clusters label sequences into

sub-activities such as up, down and circle. Finally, a robust matching procedure

based on nearest neighbour classification groups sub-activity sets into complex

sequences such as striking, pounding, swirling (= repeated circle) and so on.
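
The quantization and matching steps can be sketched as follows. This is an illustrative sketch only: it assumes the codebook has already been learned (for instance by LVQ), collapses consecutive repeated labels, and uses edit distance for the nearest neighbour match. None of these choices is stated in the text, and the function names, codebook and templates are hypothetical.

```python
def nearest_tile(vec, codebook):
    """Index of the closest prototype, i.e. the Voronoi tile containing vec."""
    return min(range(len(codebook)),
               key=lambda i: sum((v - c) ** 2 for v, c in zip(vec, codebook[i])))

def label_sequence(vectors, codebook):
    """Map parameter vectors to tile labels, collapsing consecutive repeats."""
    labels = []
    for vec in vectors:
        tile = nearest_tile(vec, codebook)
        if not labels or labels[-1] != tile:
            labels.append(tile)
    return labels

def edit_distance(a, b):
    """Levenshtein distance between two label sequences (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]

def classify(labels, templates):
    """Nearest neighbour match of a label sequence against named templates."""
    return min(templates, key=lambda name: edit_distance(labels, templates[name]))
```

With a hypothetical two-dimensional codebook `[(0, 1), (0, -1), (1, 0), (-1, 0)]` and templates `{"up": [0], "down": [1], "circle": [0, 2, 1, 3]}`, a sequence of vectors that visits the tiles 0, 2, 1, 3 in order is matched to `circle`. In the method described above the vectors would be six-dimensional affine parameter vectors rather than two-dimensional points.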

Figures 2-4 show some steps of the approach explained earlier. Figure

2(a) shows an image taken from a 400-frame pounding sequence, while

Figure 2(b) shows an image from a 100-frame swirling sequence.

Images were captured by a progressive-scan colour SONY DFW-VL500

camera at a frame rate of 30 frames per second, each frame being 320 ×

240 pixels in size.

Figure 3 (first row) shows the points with the maximum normal flow,

while the arm boundary is depicted in the second row.

Figure 4 (first row) shows the residual flow, which is computed as the

difference between the normal motion field given by the affine parameters

and the normal flow. Figure 4 (second row) shows the re-estimated affine flows

after outlier removal, again using EM to fit a Gaussian distribution to the residual

flow.
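
The residual computation and re-estimation can be sketched under explicit assumptions. The sketch uses a six-parameter affine model u = a1 + a2·x + a3·y, v = a4 + a5·x + a6·y, fits it by ordinary least squares to normal flow samples, and rejects samples whose residual lies more than k standard deviations from the mean. The parameterization, the cut-off `k = 2.5` and all function names are hypothetical, and the single-Gaussian fit is again the closed-form mean and standard deviation to which EM reduces in that case.

```python
import math

def design_row(x, y, nx, ny):
    """Coefficients of (a1..a6) in the predicted normal flow n . (u, v)."""
    return [nx, nx * x, nx * y, ny, ny * x, ny * y]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_affine(samples):
    """Least-squares affine fit; samples are (x, y, nx, ny, normal_flow)."""
    ata = [[0.0] * 6 for _ in range(6)]
    atb = [0.0] * 6
    for x, y, nx, ny, un in samples:
        row = design_row(x, y, nx, ny)
        for i in range(6):
            atb[i] += row[i] * un
            for j in range(6):
                ata[i][j] += row[i] * row[j]
    return solve(ata, atb)

def residuals(params, samples):
    """Affine normal motion field minus measured normal flow, per sample."""
    return [sum(p * r for p, r in zip(params, design_row(x, y, nx, ny))) - un
            for x, y, nx, ny, un in samples]

def robust_fit(samples, k=2.5):
    """Fit, drop samples with |residual - mean| > k * sigma, then refit."""
    res = residuals(fit_affine(samples), samples)
    mu = sum(res) / len(res)
    sigma = math.sqrt(sum((r - mu) ** 2 for r in res) / len(res))
    inliers = [s for s, r in zip(samples, res) if abs(r - mu) <= k * sigma]
    return fit_affine(inliers), inliers
```

`robust_fit` returns the re-estimated parameters together with the samples kept as inliers. A practical version would also guard against a degenerate sample set, in which the 6 × 6 system is singular.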

3.1.2 Facial data elaboration

All animals, and humans above all, use the face as the main channel for non-verbal

communication. A facial expression is composed of several features such as eye

movements, mouth and eyebrow position, configuration of facial muscles and

so on. All these signals are part of a perceptive model that is used as the basis

for understanding emotions in the user [11].
