• How quickly the agent takes its turn. We evaluate this by measuring
the length of the silence before each successful turn-transition
(from the other party to the agent) and comparing the results to human data.
• Frequency of overlapping speech. Because the agent should be
learning to be polite (i.e., not to speak on top of the other party), the
number of overlaps should decrease over time. (Note: In our
Speaking-with-Self condition we use a closed sound loop with no
open mic, but an open-mic setup when the system speaks with
humans.)
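As a rough illustration of the two measures above, the following sketch computes the silence before each turn-transition and counts overlaps from a list of timestamped utterances. The `Utterance` record, the function names, and the sample timings are hypothetical and are not taken from the system described here. The sketch also assumes that a "successful" transition is one where the agent starts speaking only after the other party has stopped.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    # Hypothetical record; times are in milliseconds from session start.
    speaker: str  # "agent" or "other"
    start_ms: int
    end_ms: int


def transition_gaps_ms(utterances):
    """Silence before each other-to-agent transition that has no overlap."""
    ordered = sorted(utterances, key=lambda u: u.start_ms)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if (prev.speaker == "other" and cur.speaker == "agent"
                and cur.start_ms >= prev.end_ms):
            gaps.append(cur.start_ms - prev.end_ms)
    return gaps


def count_overlaps(utterances):
    """Number of times one speaker starts before the other has finished."""
    ordered = sorted(utterances, key=lambda u: u.start_ms)
    return sum(
        1
        for prev, cur in zip(ordered, ordered[1:])
        if prev.speaker != cur.speaker and cur.start_ms < prev.end_ms
    )


def fraction_below(gaps, threshold_ms):
    """Share of transitions taken in less than threshold_ms."""
    if not gaps:
        return 0.0
    return sum(1 for g in gaps if g < threshold_ms) / len(gaps)


# Made-up timings for illustration only.
demo = [
    Utterance("other", 0, 1000),
    Utterance("agent", 1250, 2000),
    Utterance("other", 2100, 3000),
    Utterance("agent", 2900, 3500),  # starts before the other party ends
    Utterance("other", 3600, 4000),
    Utterance("agent", 4600, 5000),
]
demo_gaps = transition_gaps_ms(demo)
```

Applying `fraction_below` with thresholds of 500 and 300 to such gaps gives per-session values of the kind the two turn-taking measures describe.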
5.1 Hypotheses and statistics
To evaluate the learning mechanism, we used linear regression on the
single-case data sessions: Artificial (the agent talking to a copy of itself
in the interviewee role for 10 consecutive sessions with 30 questions
each) and Single person (the agent talking to one person for 10 consecutive
sessions with 30 questions each). For the 10-person condition (asking 10
different people 30 questions each), we used within-subject t-tests between
the first five sessions and the second five sessions. In all cases the
dependent variables are: (a) Taking Turn in less than 500 msecs, (b)
Taking Turn in less than 300 msecs, and (c) Number of Overlaps.
The hypotheses are:
• H1: The frequency of taking the turn in less than 500 msecs should
increase as a function of the number of turns.
• H2: The frequency of taking the turn in less than 300 msecs should
increase as a function of the number of turns.
• H3: The frequency of overlapping speech should be higher in the first
half of the interviews than in the second half of the interviews.
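The two statistical tools named in this section can be sketched in a few lines of plain Python. This is a minimal illustration, not the analysis code used for the evaluation. The function names are hypothetical, and how the paired values for the t-test are formed is not spelled out here, so the sketch simply takes two equal-length lists. A positive regression slope would be in line with H1 and H2. A positive t statistic (first half minus second half) would be in line with H3. Significance would still have to be read from a t distribution with n - 1 degrees of freedom.

```python
import math


def ols_slope_intercept(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def paired_t_statistic(first, second):
    """t statistic for paired samples, computed on first[i] - second[i]."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n)


# Made-up numbers for illustration only; not results from the study.
turn_index = [1, 2, 3, 4, 5]
fast_turn_counts = [2, 4, 6, 8, 10]
slope, intercept = ols_slope_intercept(turn_index, fast_turn_counts)

overlaps_first = [5, 6, 7, 8]
overlaps_second = [3, 5, 4, 6]
t_stat = paired_t_statistic(overlaps_first, overlaps_second)
```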
5.2 Interview setup
The agent is configured to ask 30 pre-defined questions, using, among
other things, STW to control its turn-taking behavior during the
interlocutor's turn (see Figure 5). Each interaction takes approximately
five minutes. We have run three different evaluation conditions with
the system.
1. The system interviewing itself (“Artificial”). Having a single
artificial interlocutor interact with a non-learning instance
of itself gives us very consistent behavior in a setup with
no background noise, providing a baseline for the real-world
evaluations.