his phrasing, and his timing, to keep the audience in maximum contact,
allowing them to parse and perceive the impact of each small chunk
before moving on to the next. He is talking not 'to', but 'with' his
audience, repeating chunks where necessary, and constantly testing
their comprehension to maintain their attention.
Our task is to provide machines with a similar faculty—to process
speech interactively, taking into account the cognition of the listener.
2. Even Dogs and Young Babies Can Do It!
The human is a socially organized animal, and we are unique among
animals in spending so much time rearing our young. Our infants are
helpless and dependent on a carer for longer than any other animal,
but in turn they spend long hours watching and (more often) listening
to the people around them, and consequently they learn the norms
of human behavioral patterns very early. They become familiar with
the patterns of speech sounds and rhythms of spoken interaction from
even before birth, as the sounds of the mother's speech are carried
into the womb to the hearing infant along with her blood (with its
varying adrenaline levels) as she goes about her daily conversational
activities (Karmiloff and Karmiloff-Smith, 2001).
The Hungarian ethologist Ádám Miklósi has shown that the feature
which most differentiates dogs from wolves, their closest animal
relatives, is that only the former have really learnt to watch and learn
from human behavior. In this way, dogs become companions to the
humans who host and care for them, while wolves lack the capacity
for such serendipitous coexistence (Miklósi, 2008). The capacity to
observe, to interpret behavior and also to empathize is perhaps what
underlies the mechanisms of companionship that are fundamental to
all social relationships.
A key feature of human communication is that we have learnt
to express propositional content and social information
simultaneously. From earliest times, we have watched our fellow
beings and learnt to read information about their cognitive states
from their behavior (Dunbar, 1998). We know whether or not they
are listening, and paying attention, and make continuous estimates
about their levels of comprehension as we speak. We unconsciously
structure our speech to facilitate this process.
Our speech processing technologies, however, presently lack any
such notion of empathy. They also lack the ability to observe the effects
of their actions on others. People consequently feel uneasy with much
of present speech technology, and this may be hampering its acceptance