6.3 Silence length
A study on human behavior by Wilson and Wilson (2005) measured
silences in face-to-face conversations where participants always had
something to say. They reported response time to be shorter than 500
msecs in 70% of turn-transitions and shorter than 200 msecs in 30%
of turn-transitions.
Our study was conducted over a relatively low-quality voice
connection (Skype) and not face-to-face, and thus allows only for voice
cues to communicate envelope feedback regarding turns. The studies
are compatible in the sense that our agent always has something to say,
while people might have to think a bit before they answer. Silences in
telephone conversations tend to average about 100 msecs longer than
in face-to-face conversations (Bosch et al., 2005), so we have measured
silences shorter than 300 msecs and shorter than 500 msecs.
Table 6. Average silences for each condition.

Condition        Shorter than 500 msecs   Shorter than 300 msecs
Artificial       53.1%                    32.2%
Single person    44.0%                    16.3%
10 people        43.7%                    8.4%
Our agent takes its turn in less than 500 msecs in 43.7% to 53.1% of
turns across our three conditions (see Table 6). These figures are
averages over the last nine interviews; the first interview is excluded
because its preset STW of 1280 and 640 msecs would interfere with
obtaining results based on real-time interactions.
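The averaging described above can be illustrated with a minimal sketch. It assumes each interview is available as a list of before-turn silence lengths in msecs, and that the reported figure is the mean of per-interview percentages; the text does not state whether silences were instead pooled across interviews. The function names and the sample data are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence


def share_shorter_than(silences_ms: Sequence[float], threshold_ms: float) -> float:
    """Percentage of silences strictly shorter than threshold_ms."""
    if not silences_ms:
        raise ValueError("no silences to summarise")
    count = sum(1 for s in silences_ms if s < threshold_ms)
    return 100.0 * count / len(silences_ms)


def summarise_condition(
    interviews: Sequence[Sequence[float]],
    thresholds_ms: Sequence[float] = (500, 300),
    skip_first: int = 1,
) -> Dict[float, float]:
    """Mean per-interview percentage below each threshold.

    The first `skip_first` interviews are dropped, mirroring the exclusion
    of the first interview (preset STW) described in the text.
    Assumption: per-interview percentages are averaged, not pooled.
    """
    kept = interviews[skip_first:]
    if not kept:
        raise ValueError("no interviews left after skipping")
    return {
        t: sum(share_shorter_than(iv, t) for iv in kept) / len(kept)
        for t in thresholds_ms
    }


# Hypothetical data: three interviews, silence lengths in msecs.
example = [
    [1280, 640],           # first interview, excluded from the average
    [400, 250, 600, 100],  # 75% under 500 msecs, 50% under 300 msecs
    [450, 700],            # 50% under 500 msecs, 0% under 300 msecs
]
summary = summarise_condition(example)
```

With this made-up input, `summary` maps 500 to 62.5 and 300 to 25.0, in the same form as one row of Table 6.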
Looking at how silence length evolves over the series of interviews,
Askur (the interviewer) adapts relatively quickly at the beginning in
all cases. In the first session, when the agent interviews a copy of
itself in the interviewee role, it faces the most consistent
interviewee; the agent improves steadily, with only minor lapses,
until it reaches about 70% of silences shorter than 500 msecs and
around 40% of silences shorter than 300 msecs (see Figure 7). When
interviewing a single person for 10 consecutive interviews, the system
cannot learn as well as when interviewing a copy of itself, since there
is more variation in behavior.
When interviewing 10 people, Askur has reached about 50% of
before-turn silences shorter than 500 msecs (see Figure 7), compared to
70% in the human-human comparison data. There are two distinct dips
in performance in interviews four and eight. These can be attributed
to differences in the prosody patterns used by participants (see Figure