Is there money to be made by carefully matching voice and words? Yes. People who heard the
recorded voice user interface bid more when they heard “I,” while people who heard the synthetic
voice interface bid more when they did not hear “I.”
When Not to Use “I” with Recorded Speech
Recorded speech clearly benefits voice user interfaces, although there are nuances that must be
taken into account (see Nass and Brave, 2005), and people expect these systems to use the term “I.”
Even so, designers should not assume that using “I” is always optimal. One domain in which avoiding
“I” is effective is when formality is desirable (Nass and Brave, 2005).
A second domain in which the avoidance of “I” may be useful is when the system wants
to deflect blame from itself. Every child's instinct is to say “the lamp broke” rather than “I broke the
lamp,” and in the heat of the Watergate scandal, President Richard Nixon said “Mistakes were
made” rather than “I made mistakes.” Similarly, when a person requests information that may not be
provided, the system might benefit from behaving like a stereotypical bureaucrat by saying “The
rules do not permit that information to be given” rather than “I cannot give you that information
because of the rules.” This strategy can also be useful when the system has to deliver bad news, for
example, “That item is not in stock” rather than “I don't have that item right now.”
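The blame-deflecting wordings above can be kept side by side in a small prompt table, as in the minimal Python sketch below. The prompt strings are the ones quoted in the text; the names `Situation`, `PROMPTS`, and `blame_deflecting_prompt` are hypothetical and chosen only for illustration, not taken from any real voice-interface toolkit.

```python
from enum import Enum


class Situation(Enum):
    """Situations from the text in which avoiding "I" may help (hypothetical names)."""
    REFUSAL = "refusal"
    BAD_NEWS = "bad_news"


# Each entry holds (personal wording, impersonal wording), both quoted from the text.
PROMPTS = {
    Situation.REFUSAL: (
        "I cannot give you that information because of the rules.",
        "The rules do not permit that information to be given.",
    ),
    Situation.BAD_NEWS: (
        "I don't have that item right now.",
        "That item is not in stock.",
    ),
}


def blame_deflecting_prompt(situation: Situation, deflect: bool = True) -> str:
    """Return the impersonal wording when blame should be deflected, else the personal one."""
    personal, impersonal = PROMPTS[situation]
    return impersonal if deflect else personal


if __name__ == "__main__":
    print(blame_deflecting_prompt(Situation.BAD_NEWS))
    print(blame_deflecting_prompt(Situation.REFUSAL, deflect=False))
```

Keeping both wordings in one table makes it straightforward to compare them in user testing rather than assuming that one is always better.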
Passive voice can also be useful when a voice input system fails to understand the user (Nass
and Brave, 2005). Thus, an (actual) airline system that only uses “I” when it does not understand
the user is particularly poorly designed: The exceptional use of “I” draws attention to the personal
aspect of the interface at precisely the time when users are most frustrated and annoyed.
In a related way, cultural differences dictate when one should use “I” or “we.” In individualistic
cultures, such as the United States and Germany, people are more persuaded when the speaker,
including a computer agent, uses “I.” Indeed, in the United States (and likely other individualistic
cultures), individuals highlighting their own identity are evaluated more favorably than are the
same individuals in aggregates or groups (Sears, 1983). However, in collectivist cultures, including
most of Asia, it is much more effective to refer to “we” (Maldonado and Hayes-Roth, 2004;
Miller et al., 2001).
A third domain in which “I” may be problematic is when the user must provide input via
touch-tone (DTMF). There is a basic conversational principle that it is polite to respond in the
same modality that the user uses. People return a phone call with a phone call, not e-mail; a letter
with a letter, rather than a phone call; and a spoken yes/no question with words, rather than a nod
of the head. When a voice interface says “I” and then proceeds to refuse to let the person reply by
voice, this might be seen as controlling and unfair: “He/she gets to speak, but I only get to push
buttons?!” The avoidance of “I” may reduce the social presence (Lee, 2004) of the system and
thereby make it more acceptable to restrict the user to touch-tone responses.
The absence of “I” can be a powerful rhetorical technique when the system wants the user
to respond to the system's statements as certainties. For example, a voice user interface that says,
“I have four messages for you” or “I see that you are free between 12 and 2 PM,” or “I think that you
will like these four restaurants” may seem more uncertain than a system that says, “There are four
messages,” “You are free between 12 and 2 PM,” or “You will like these four restaurants.” Conversely,
voice user interfaces that want full focus on themselves and their unique capabilities likely
should use the term “I,” as in “I have searched through thousands of songs to find these three for
you” as opposed to “Thousands of songs have been searched; here are three for you” (Nass and
Brave, 2005).
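The guidelines in this section can be gathered into one decision function, sketched here in Python. The field names, the function name `choose_self_reference`, and above all the order in which the rules are checked are assumptions made for illustration; the text does not say how to resolve conflicts between guidelines (for example, a formal system in a collectivist culture).

```python
from dataclasses import dataclass


@dataclass
class PromptContext:
    """Design conditions discussed in the text (field names are hypothetical)."""
    recorded_voice: bool = True         # False means a synthetic voice
    formal: bool = False                # formality is desirable
    deflect_blame: bool = False         # refusals, bad news, recognition failures
    touch_tone_only: bool = False       # user may answer only via touch-tone (DTMF)
    assert_certainty: bool = False      # statements should sound like certainties
    collectivist_culture: bool = False  # audience is from a collectivist culture


def choose_self_reference(ctx: PromptContext) -> str:
    """Return "I", "we", or "none" (avoid self-reference).

    The precedence of the checks is an assumption, not something the text specifies.
    """
    # Listeners to a synthetic voice responded better when they did not hear "I".
    if not ctx.recorded_voice:
        return "none"
    # Conditions under which the text suggests avoiding "I" even with recorded speech.
    if ctx.formal or ctx.deflect_blame or ctx.touch_tone_only or ctx.assert_certainty:
        return "none"
    # In collectivist cultures, "we" is reported to be more effective than "I".
    if ctx.collectivist_culture:
        return "we"
    # Default for recorded speech: people expect the system to say "I".
    return "I"


if __name__ == "__main__":
    print(choose_self_reference(PromptContext()))
    print(choose_self_reference(PromptContext(touch_tone_only=True)))
    print(choose_self_reference(PromptContext(collectivist_culture=True)))
```

A rule table like this is only a starting point; each condition reflects a tendency reported in the cited studies, so the resulting choice would still need to be checked with real users.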