Paired Comparison Technique, and the Likert Scale (Sinclair, 1995). The simple rating method consists of
a set of questions related to workload or task attributes, and a scale on which those questions are rated.
The rating scale is represented by a 100 mm line, usually with subdivisions and labels at the ends of the
scale. Intermediate labels or numbers are often assigned to some or all subdivisions. Thurstone's
Paired Comparison Technique asks the subject to compare two entities (e.g., two tasks), or every combination
of all measured entities, and decide which one is larger or smaller. Each pair of compared entities
can also be evaluated on a rating scale with assigned numbers. The Likert Scale has an odd number of
discrete options, ranging from 1 to 5 (or 7), with labels from “strongly disagree” to
“strongly agree,” respectively.
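The paired comparison procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Thurstone's full scaling model: the task names, the `judgments` data, and the helper names `all_pairs` and `win_counts` are hypothetical, and the entities are summarized by a simple count of how often each was judged the larger of its pair.

```python
from collections import Counter
from itertools import combinations


def all_pairs(entities):
    """Return every unordered pair of entities to present to the subject."""
    return list(combinations(entities, 2))


def win_counts(entities, judgments):
    """Count how often each entity was judged the larger of its pair.

    `judgments` maps each pair (a, b) to the entity the subject chose.
    This tally is an assumed simplification, not Thurstone's scaling.
    """
    counts = Counter({entity: 0 for entity in entities})
    for pair in all_pairs(entities):
        counts[judgments[pair]] += 1
    return dict(counts)


# Hypothetical data: three tasks compared for perceived workload.
tasks = ["tracking", "monitoring", "data entry"]
judgments = {
    ("tracking", "monitoring"): "tracking",
    ("tracking", "data entry"): "tracking",
    ("monitoring", "data entry"): "data entry",
}

print(all_pairs(tasks))
print(win_counts(tasks, judgments))
```

Comparing every combination of n entities requires n(n - 1)/2 judgments, so three tasks need 3 comparisons while ten tasks already need 45.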
There are several common approaches applied in subjective workload measurement (Tsang and
Vidulich, 1994): (1) unidimensional versus multidimensional rating scales; (2) immediate versus retrospective
assessment procedures; and (3) absolute estimation versus relative judgment approaches. A unidimensional
scale assesses only one dimension of workload or focuses on the overall workload level, while
a multidimensional scale evaluates several aspects or components of workload. The ratings can be obtained
immediately after performance of each condition or retrospectively after experiencing all task conditions.
The simple or direct rating approach is called absolute magnitude estimation, in contrast to the relative
judgment approach where, as in Thurstone's Paired Comparison Technique, the subject is asked to
assess the task condition in reference to a single standard or to multiple task conditions.
Each subjective method should demonstrate several properties in order to be accepted as a good
measurement tool. These properties are validity, reliability, sensitivity, diagnosticity, intrusiveness,
transferability, ease of implementation, and subject acceptability (Eggemeier et al., 1991; Wierwille and
Eggemeier, 1993). These criteria should be applied when the suitability of a workload assessment
method is considered for the evaluation of a particular type of workload, or for a particular type of
environment, task, or system.
Reliability is the degree of precision with which a method or instrument is able to measure what it
measures. Reliability can be assessed as homogeneity, consistency, or stability of measurement, or, in
the case of two or more raters, as interrater reliability (ISO/FDIS 10075-3, 2002).
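As an illustration of how the consistency of a multi-item rating scale might be quantified, the sketch below computes Cronbach's alpha. This is an assumption made for the example: the source does not name a particular reliability index, and the function name `cronbach_alpha` and the sample ratings are hypothetical.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a table of ratings.

    `ratings` is a list of rows, one per subject; each row holds that
    subject's rating on every item of the scale.
    """
    n_items = len(ratings[0])
    if n_items < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two subjects")
    # Sample variance of each item across subjects.
    item_vars = [variance([row[i] for row in ratings]) for i in range(n_items)]
    # Sample variance of the subjects' total scores.
    total_var = variance([sum(row) for row in ratings])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: four subjects, three items on a 1 to 5 scale.
sample = [
    [2, 3, 2],
    [4, 4, 5],
    [3, 3, 3],
    [5, 4, 5],
]
print(round(cronbach_alpha(sample), 3))
```

Values close to 1 indicate that the items vary together across subjects; a value near 0 indicates that the items share little common variance.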
Validity is the degree to which a method or instrument is able to measure what it is intended to
measure (ISO/FDIS 10075-3, 2002). Thus, in the case of mental workload, the measures should reflect
differences in cognitive demands, but not changes in other variables, such as physical workload, that are not
necessarily associated with mental workload.
Sensitivity refers to how well a technique can distinguish differences in the levels of load required to
accomplish a task (Wierwille and Eggemeier, 1993). With regard to workload, this is the primary
criterion, since it is important to assess differences in the workload imposed by a task or
system.
Diagnosticity is the ability to distinguish the type of workload, or to attribute it to a particular
aspect of a performed task (Tsang and Wilson, 1997). Thus, diagnosticity can help to determine
which elements or aspects of the task caused the workload.
Intrusiveness refers to the fact that applying a measurement technique can interfere with
the task performance being evaluated and can cause performance changes that are not related to
the task itself (Wierwille and Eggemeier, 1993). In order to avoid contamination of workload measures,
it is desirable to minimize the intrusiveness of the measurement method.
Transferability refers to the possibility of applying a given technique in different environments or tasks.
Implementation requirements concern such issues as ease of data collection, robustness of the
measurement instruments, and overall data quality control (Wierwille and Eggemeier, 1993). Finally,
subject acceptability refers to the subject's perception of the measurement procedure.
In reference to the criteria presented earlier, subjective workload methods are described as sensitive to
different levels of workload. However, since many research studies showed rather low diagnosticity of
the measures, it was concluded that subjective rating scales can be used only as a global measure of
workload (Eggemeier and Wilson, 1991). There is some evidence that multidimensional workload
rating scales present better diagnostic properties (Tsang, 2001). Subjective measures appear to be reliable