Castellano (2006) classified expressive gesture in human full-body movement (music and dance performances) and in the motor responses of participants exposed to music stimuli: the study identified parameters deemed important for emotion recognition and showed how these parameters could be tracked by automated recognition techniques.
Other studies show that expressive gesture analysis and classification
can be obtained by means of automatic image processing (Drosopoulos
et al., 2003; Balomenos et al., 2005) and that the integration of multiple
modalities (facial expressions and body movements) is successful for
multimodal emotion recognition (Gunes and Piccardi, 2005).
Several systems have been proposed in which visual feedback/
response is provided by analyzing some features of the users' behavior.
In such systems, the input data can be obtained from dedicated
hardware (joysticks, hand gloves, etc.), audio, and video sources.
SenToy (Paiva et al., 2003) is a doll with sensors in its arms, legs, and body. Several body positions of the doll are associated with emotional states. By manipulating the doll, users can influence the emotions of characters in a virtual game: depending on the expressed emotions, the synthetic characters perform different
actions. Taylor et al. (2005) developed a system in which the reaction
of a virtual character is driven by the way in which the user plays a
musical instrument. Kopp et al. (2003) designed a virtual agent able to
imitate natural human gestures captured as motion-tracked data. When
mimicking, the agent extracts and reproduces the essential
form features of the gesture stroke, which is the most important gesture
phase. Reidsma et al. (2006) designed a virtual rap dancer that invites
users to join him in a dancing activity. Users' dancing movements are
tracked by a video camera and guide the virtual rap dancer.
Castellano et al. (2007) investigate how emotional states can be communicated through speech, face, and gesture, both separately and jointly. In particular, gestures are analyzed by tracking the user's hands and computing a set of movement-related features. First, the user's body silhouette is extracted from the input video by
performing background subtraction. Then, hands are localized using
skin color tracking and their geometrical barycenter is determined.
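To make this localization step concrete, the following is a minimal sketch of silhouette extraction and barycenter computation, assuming OpenCV; the color-space thresholds and the single combined barycenter are illustrative assumptions rather than the authors' actual implementation.

```python
# Illustrative sketch: silhouette extraction via background subtraction,
# skin-color segmentation, and barycenter computation (thresholds are assumed).
import cv2

def hand_barycenter(frame, background):
    """Return the geometrical barycenter (x, y) of skin-colored foreground pixels."""
    # Silhouette extraction: difference against a reference background frame
    diff = cv2.absdiff(frame, background)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, silhouette = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)

    # Skin-color segmentation in HSV space (threshold values are illustrative)
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    skin = cv2.inRange(hsv, (0, 40, 60), (25, 180, 255))

    # Keep only skin-colored pixels that lie inside the silhouette
    mask = cv2.bitwise_and(silhouette, skin)

    # Geometrical barycenter from image moments
    m = cv2.moments(mask, binaryImage=True)
    if m["m00"] == 0:
        return None  # no hand region detected in this frame
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])
```

Tracking this barycenter across frames yields the 2D point trajectory from which the movement cues described next are computed.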
The authors extract two types of indicators: movement cues and movement features. Movement cues correspond to data computed directly from points (the hands' barycenter) moving on a 2D plane (the video frame), such as speed, acceleration, and fluidity. Movement features are meta-indicators, that is, descriptors of how the movement cues vary over time, such as initial and final slope, maximum value, number of peaks, and so on. These cues and features are then provided as input to a Bayesian classifier.
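As an illustration of how such cues and meta-level features might be computed from the tracked barycenter trajectory, the sketch below derives speed and acceleration profiles and a few descriptors of their variation over time; the formulas, the assumed frame rate, and the suggested classifier are assumptions for illustration, and the fluidity cue is omitted.

```python
# Illustrative sketch: movement cues (speed, acceleration) from a 2D barycenter
# trajectory, plus meta-features describing how a cue varies over time.
import numpy as np

def movement_cues(trajectory, fps=25.0):
    """Speed and acceleration profiles from an (N, 2) array of barycenter positions."""
    traj = np.asarray(trajectory, dtype=float)
    dt = 1.0 / fps
    velocity = np.gradient(traj, dt, axis=0)   # 2D velocity per frame
    speed = np.linalg.norm(velocity, axis=1)   # scalar speed cue
    acceleration = np.gradient(speed, dt)      # scalar acceleration cue
    return speed, acceleration

def meta_features(cue, fps=25.0):
    """Descriptors of a movement cue's variation over time."""
    dt = 1.0 / fps
    # Count local maxima: samples strictly larger than both neighbors
    peaks = int(np.sum((cue[1:-1] > cue[:-2]) & (cue[1:-1] > cue[2:])))
    return {
        "initial_slope": (cue[1] - cue[0]) / dt,
        "final_slope": (cue[-1] - cue[-2]) / dt,
        "max_value": float(np.max(cue)),
        "num_peaks": peaks,
    }

# The resulting feature vectors could then be fed to a Bayesian classifier,
# e.g. a Gaussian naive Bayes model (sklearn.naive_bayes.GaussianNB), standing
# in for the classifier mentioned in the text.
```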