23.4.1 Modality Affordances
Visual information presentation is heavily emphasized in current interface designs. In particular, interface designers rely on foveal vision (as opposed to peripheral vision), which seems appropriate for the
presentation of complex graphics and, in general, for conveying large amounts of detailed information,
especially in the spatial domain. Visual information displays also allow for permanent presentation,
which affords delayed and prolonged attending.
Current interfaces make use of peripheral vision to a much lesser extent. While the two channels —
foveal and peripheral vision — do not, in the strict sense, represent separate modalities, they are associated with different options and constraints. Peripheral vision is well-suited for detecting motion,
luminance changes, and the appearance of new objects. However, in contrast to foveal vision, it does
not support the recognition of objects or details. Peripheral vision represents an early orientation
mechanism (McConkie, 1983) that can be utilized by designers to help operators attend to a relevant
location or critical information at the right time. One potential problem with peripheral visual feedback
is that the visual field changes dynamically in response to contextual factors. With increasing foveal
task loading, for example, visual attention begins to focus on information in the center of a display at the
expense of information presented in peripheral vision — a phenomenon called “attentional narrowing.”
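To make this design use of peripheral vision concrete, the sketch below is a minimal, purely illustrative Python/Tkinter example rather than a description of any system cited in this chapter; the widget layout, the flash_border routine, and the timing values are assumptions chosen for demonstration. It briefly brightens a thin strip along the edge of a window so that a luminance change in the periphery can orient attention toward a critical event without interrupting the foveal task.

import tkinter as tk

FLASH_MS = 300  # assumed duration of each luminance step, in milliseconds

def flash_border(root, border, times=3):
    # Toggle the strip between a bright and a dim color a few times so that
    # the luminance change is detectable in peripheral vision.
    def toggle(count):
        if count == 0:
            border.configure(bg="gray20")  # restore baseline luminance
            return
        border.configure(bg="white" if border.cget("bg") == "gray20" else "gray20")
        root.after(FLASH_MS, toggle, count - 1)
    toggle(times * 2)

root = tk.Tk()
root.geometry("800x600")

# A thin strip along the top edge stands in for the visual periphery;
# the label below stands in for the operator's primary (foveal) task.
border = tk.Frame(root, bg="gray20", height=20)
border.pack(side="top", fill="x")
tk.Label(root, text="Primary (foveal) task area", font=("Helvetica", 24)).pack(expand=True)

# Simulate a critical event arriving two seconds after start-up.
root.after(2000, flash_border, root, border)
root.mainloop()

In a real interface, such a cue would remain subject to the attentional-narrowing caveat above: under high foveal task loading, a peripheral luminance change may simply be missed.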
The auditory channel differs from vision along several dimensions. First, it is omnidirectional, thus
allowing for information to be picked up from any direction and, to some extent, in parallel with information presented via other channels. Second, auditory information presentation is transient. This
potential limitation is compensated for by a longer short-term storage of auditory (as opposed to
visual) information so that it can be processed with some delay. Finally, since it is impossible for us
to “close our ears,” auditory displays tend to be intrusive and are therefore reserved for alerting functions.
The auditory channel shares a number of characteristics with haptic sensory information, which is currently underutilized in interface design. Most importantly, cues presented via these two modalities are
transient in nature. Also, like vision and hearing, touch allows for the concurrent presentation and
extraction of several dimensions such as frequency and amplitude in the case of vibrotactile cues.
Touch differs from vision and hearing in that it is capable of both sensing and acting on the environment.
In general, touch can serve a variety of purposes, including: (a) grasping and manipulating tools, (b)
object identification, (c) exploring the spatial layout of objects, (d) assessing texture, temperature,
weight, and other attributes of objects, (e) sensing vibrations, and (f) exploring spaces that are not accessible to vision (Lederman and Browse, 1988).
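As a purely illustrative sketch of the point about concurrent dimensions in vibrotactile cues, the following Python snippet maps two hypothetical variables, urgency and proximity, onto the frequency and amplitude of a tactor drive signal; the function name, the mapping ranges, and the sample rate are assumptions made for this example rather than values from the cited work.

import numpy as np

SAMPLE_RATE = 8000  # samples per second of the tactor drive signal (assumed)

def vibrotactile_cue(urgency, proximity, duration_s=0.5):
    # Hypothetical mappings: urgency in [0, 1] sets frequency between
    # 60 and 250 Hz; proximity in [0, 1] sets normalized amplitude
    # between 0.2 and 1.0. Both ranges are illustrative assumptions.
    freq = 60.0 + urgency * (250.0 - 60.0)
    amp = 0.2 + proximity * (1.0 - 0.2)
    t = np.linspace(0.0, duration_s, int(SAMPLE_RATE * duration_s), endpoint=False)
    return amp * np.sin(2.0 * np.pi * freq * t)

# A moderately urgent cue about a nearby object: both dimensions are
# carried by the same half-second vibration.
signal = vibrotactile_cue(urgency=0.6, proximity=0.9)
print(signal.shape, round(float(signal.max()), 3))

Because the two physical dimensions vary independently, an operator could in principle extract both variables from a single cue, which is the property highlighted above.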
23.4.2 Multimodal Interface Design to Date
Research in the area of multimodal interfaces has expanded rapidly during the past decade and has led to
the emergence of two groups of multimodal interfaces. The first group includes systems that support two
or more combined user input modes such as speech, pen, touch, manual gestures, gaze, and head and
body movements (e.g., Benoit and Le Goff, 1998; Cohen et al., 1997; Pentland, 1996; Stork and Hennecke, 1995; Turk and Robertson, 2000; Vo and Wood, 1996; Wang, 1995; Zhai et al., 1999). These
systems have been developed to support functions such as increased system accessibility for diverse
users, improved performance of recognition-based systems, and increased expressive power (see
Oviatt and Cohen, 2000). Applications include map-based navigation systems, medical systems for
mobile use in noisy environments, person identification systems for security purposes, and web-based
transaction systems (for an overview see Oviatt, 2002).
The second group of interfaces presents users with multimodal system output to enhance awareness of
their overall workspace and surroundings (so-called ambient displays — for example, MacIntyre et al.,
2001) and to support time-sharing and attention management in the context of human-human and
human-machine interactions (e.g., Ho et al., 2001; Nikolic and Sarter, 2001; Sklar and Sarter, 1999).
Multimodal output systems have been designed primarily for virtual-reality applications and for use
in a variety of high-risk, data-rich, event-driven domains such as future car cockpits (e.g., Means et al.,