a distance score. The distance scores for all the reference templates are sent to a
decision rule, which provides a classification of the input gesture, and possibly an
ordered (by distance) set of the best n candidates.
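As a minimal sketch of such a decision rule, the Python fragment below assumes a plain nearest-template rule: the template with the smallest distance score wins, and the n best candidates are returned in order of increasing distance. The function name classify_gesture, the gesture labels and the score values are hypothetical and serve only as an illustration; the text does not specify the rule in this detail.

```python
import numpy as np

def classify_gesture(distances, labels, n_best=3):
    """Nearest-template decision rule (an illustrative assumption).

    distances: one distance score per reference template.
    labels:    gesture label of each reference template.
    Returns the winning label and the n_best (label, distance) pairs,
    ordered by increasing distance.
    """
    distances = np.asarray(distances, dtype=float)
    # Stable sort so that ties keep the original template order.
    order = np.argsort(distances, kind="stable")[:n_best]
    candidates = [(labels[i], float(distances[i])) for i in order]
    return candidates[0][0], candidates

# Hypothetical scores for four reference templates.
scores = [0.82, 0.15, 0.47, 0.33]
names = ["spin", "jump", "step", "bow"]
best, ranked = classify_gesture(scores, names, n_best=2)
print(best, ranked)
```

A practical system might also reject the input when even the smallest distance exceeds a threshold; that refinement is omitted here.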
11.3 Spherical Self-organizing Map (SSOM)
Prior to recognition, the system shown in Fig. 11.3 creates the gesture reference
templates using a training algorithm. The first step is to automatically parse samples
from across the spectrum of expected dance movements into a discrete set of
postures. This is achieved using the SSOM, an unsupervised clustering algorithm that
reduces a large number of input data vectors to a small set of prototypical units. The
SSOM enables learned postures to be allocated to, and distributed across, nodes on
a predefined lattice [344, 348]. This results from the wrap-around neighbourhood
learning that occurs when the lattice forms a closed sphere. A useful feature
of an SSOM-based approach is that the discrete space is constructed in such a way
as to retain associations that exist in the original input space, i.e. learned postures
are positioned on the map near other postures that are very similar in nature.
As a consequence of this topology-preserving mapping, a sequence of postures
(making up a movement or gesture) should trace a relatively smooth trajectory on
the map. It is from this trajectory (a sequence of key postures) that the descriptors
representing each gesture are acquired.
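The idea of a gesture tracing a trajectory on the map can be sketched as follows. The fragment assumes that each posture vector is assigned to the node whose weight vector is nearest in Euclidean distance, and that consecutive repeats of the same node are merged to leave a sequence of key postures. Both choices, as well as the function name posture_trajectory and the toy two-dimensional data, are assumptions made for illustration and are not taken from the text.

```python
import numpy as np

def posture_trajectory(postures, weights):
    """Map a non-empty posture sequence to node indices, merging repeats.

    postures: array of shape (T, D), one posture vector per frame.
    weights:  array of shape (N, D), one weight vector per map node.
    """
    postures = np.asarray(postures, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Squared Euclidean distance from every posture (rows) to every node (columns).
    d2 = ((postures[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    winners = d2.argmin(axis=1)
    # Keep a node only when it differs from the previously visited node.
    trajectory = [int(winners[0])]
    for node in winners[1:]:
        if int(node) != trajectory[-1]:
            trajectory.append(int(node))
    return trajectory

# Toy example: three nodes in a 2-D input space, four frames of a movement.
toy_weights = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
toy_postures = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.1], [1.0, 0.8]]
print(posture_trajectory(toy_postures, toy_weights))  # [0, 1, 2]
```

Here nodes are identified by a flat index; on the spherical lattice each index would correspond to one node position on the sphere.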
The map's spherical lattice is constructed by progressively sub-dividing a regular
icosahedron down to a desired level ($l$). This results in a series of nodes uniformly
arranged on a tessellated unit sphere (with uniform triangular elements). A sphere
tessellated one level ($l = 1$) would result in 12 nodes, while ($l = 2$) and ($l = 3$)
would result in lattices of 42 and 162 nodes, respectively. Each node on the sphere is
then represented by a weight vector $w_{i,j,k} \in \mathbb{R}^D$, which models a key posture from
the input space, where $w_{i,j,k}$ is the weight vector of the $(i, j, k)$th node. The total number
of nodes represents the number of postures that can be learned by the map. In this
representation, nodes are each equidistant from their immediate neighbours, with
which they form a hexagonal neighbourhood.
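The node counts quoted above can be checked with a short sketch that builds the lattice by midpoint subdivision of an icosahedron, projecting every new vertex onto the unit sphere. The function name icosphere and the particular vertex and face ordering are illustrative assumptions; level 1 is taken to be the plain icosahedron so that the numbering matches the text, which gives 10 * 4**(l - 1) + 2 nodes at level l.

```python
import numpy as np

def icosphere(level):
    """Return (vertices, faces) of an icosahedron subdivided (level - 1) times.

    level=1 is the plain icosahedron, matching the counting used in the text.
    """
    if level < 1:
        raise ValueError("level must be >= 1")
    t = (1.0 + 5 ** 0.5) / 2.0  # golden ratio
    verts = [(-1, t, 0), (1, t, 0), (-1, -t, 0), (1, -t, 0),
             (0, -1, t), (0, 1, t), (0, -1, -t), (0, 1, -t),
             (t, 0, -1), (t, 0, 1), (-t, 0, -1), (-t, 0, 1)]
    verts = [np.array(v, dtype=float) / np.linalg.norm(v) for v in verts]
    faces = [(0, 11, 5), (0, 5, 1), (0, 1, 7), (0, 7, 10), (0, 10, 11),
             (1, 5, 9), (5, 11, 4), (11, 10, 2), (10, 7, 6), (7, 1, 8),
             (3, 9, 4), (3, 4, 2), (3, 2, 6), (3, 6, 8), (3, 8, 9),
             (4, 9, 5), (2, 4, 11), (6, 2, 10), (8, 6, 7), (9, 8, 1)]
    for _ in range(level - 1):
        midpoint = {}  # edge (a, b) with a < b -> index of its midpoint vertex

        def mid(a, b):
            key = (min(a, b), max(a, b))
            if key not in midpoint:
                m = verts[a] + verts[b]
                verts.append(m / np.linalg.norm(m))  # project onto unit sphere
                midpoint[key] = len(verts) - 1
            return midpoint[key]

        new_faces = []
        for a, b, c in faces:
            ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
            # Each triangle is split into four smaller triangles.
            new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
        faces = new_faces
    return np.array(verts), faces

print([len(icosphere(l)[0]) for l in (1, 2, 3)])  # [12, 42, 162]
```

Sharing midpoints between adjacent triangles (the midpoint dictionary) is what keeps the node count at 12, 42 and 162 rather than duplicating vertices along every edge.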
Figure 11.4 shows a cluster unit of the SSOM. Each training pattern in the input
space is connected to every cluster unit by a weight vector $w_{i,j,k}$. Every cluster unit
at $(i, j, k)$ has a variable neighbourhood ($NE_{i,j,k}$) with a decreasing radius. All the
nodes that fall within the area defined by $NE_{i,j,k}$ constitute the region-of-influence
of $(i, j, k)$. Let $T = \{x_i\}$, $i = 1, 2, \ldots$, be the training set, where
$x \in \mathbb{R}^D$. Each vector $x$ is referred to as
a posture vector in a dance gesture. The learning process of the SSOM starts by
initializing the weight vectors $w_{i,j,k}$ with small random values distributed throughout
the input space. Various steps are employed by the SSOM to topologically reorder
the cluster weights on the spherical lattice, as follows [344, 348]: