7.3.2 Computational Tools for Making Survey Research Scalable
Psychometric scales are among the most ubiquitous measurement tools in the social sciences. However, they are not free of problems. The development of a set of scales is often described as a three-step process: item generation, scale development, and scale evaluation (Hinkin, 1995). The first step aims at enhancing the content validity of the questionnaire (i.e. ensuring that the proposed items provide complete coverage of the domain of interest); the latter two steps aim at enhancing its convergent and discriminant validity (i.e. ensuring that each item correlates highly with other items that attempt to measure the same latent construct, and weakly with items that attempt to measure different latent constructs).
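To make the convergent/discriminant distinction concrete, the following is a minimal sketch in Python of how one might summarize both from raw questionnaire responses. The function name, the column grouping passed in as constructs, and the use of mean Pearson correlations are illustrative assumptions, not a method prescribed by the chapter.

import numpy as np
import pandas as pd

def validity_summary(responses, constructs):
    """Summarize convergent vs. discriminant evidence from item correlations.

    responses:  DataFrame, one row per participant, one column per item
    constructs: dict mapping each construct name to the list of item
                columns intended to measure it (supplied by the analyst)
    """
    corr = responses.corr()  # pairwise Pearson correlations between items
    rows = []
    for name, items in constructs.items():
        within = corr.loc[items, items].to_numpy()
        # mean correlation among items of the same construct
        # (off-diagonal entries only)
        mask = ~np.eye(len(items), dtype=bool)
        convergent = within[mask].mean()
        # mean correlation with items targeting *different* constructs
        others = [c for cols in constructs.values() for c in cols
                  if c not in items]
        discriminant = corr.loc[items, others].to_numpy().mean()
        rows.append({"construct": name,
                     "mean_within_r": convergent,
                     "mean_between_r": discriminant})
    return pd.DataFrame(rows)

Under this reading, evidence for convergent and discriminant validity would show up as mean_within_r being markedly higher than mean_between_r for each construct.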
While the latter two phases are supported by a wealth of statistical techniques, the item generation phase (related to the content validity of the questionnaire) is still regarded as a largely subjective procedure (Scandura and Williams, 2000; Hinkin, 1995; Larsen et al., 2008b). Questionnaire items are typically generated through brainstorming with domain experts, or through empirical studies with participants from the targeted audience, most often involving structured interviewing techniques (cf. Hinkin, 1995). A number of limitations can be identified in current item generation practices.
First, brainstorming practices require a firm understanding of the problem domain and clear definitions of the constructs to be measured. However, as Haynes et al. (1995) noted, questionnaires are often grounded in contemporaneous theories that evolve over time and that are supported by limited empirical data; it is therefore inevitable that early questionnaires fail to capture all possible facets of a construct, and successive iterations are required for an adequate measurement of the construct.
Second, domain experts in brainstorming item generation practices often resort to lexical similarity (i.e. synonymy) when deriving new items, in an effort to assure high convergent validity of the proposed constructs. This may have substantial implications for the rating process. Larsen et al. (2008b) found that, for the majority of constructs in a sample of questionnaires, the semantic similarity between items was a significant predictor of participants' ratings (.00 < R² < .63). In such cases, participants are more likely to have employed shallow processing (Sanford et al., 2006), i.e. responding to surface features of the language rather than attaching personal relevance to the question.
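A minimal sketch of this kind of analysis is given below. Note the hedges: Larsen et al. (2008b) used a corpus-based semantic similarity measure, whereas this sketch substitutes TF-IDF cosine similarity as a rough stand-in; the function name and the choice to regress inter-item rating correlations on inter-item text similarity are assumptions made for illustration.

import numpy as np
from scipy import stats
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def similarity_vs_ratings(item_texts, responses):
    """Regress inter-item rating agreement on inter-item text similarity.

    item_texts: list of item wordings, aligned with the columns of responses
    responses:  participants x items array of ratings
    """
    # lexical stand-in for the semantic similarity measure used by
    # Larsen et al.; TF-IDF cosine is only an approximation
    tfidf = TfidfVectorizer().fit_transform(item_texts)
    sem_sim = cosine_similarity(tfidf)
    # item x item correlations of participants' ratings
    rating_corr = np.corrcoef(responses, rowvar=False)
    # keep each unordered item pair once (upper triangle, no diagonal)
    iu = np.triu_indices(len(item_texts), k=1)
    slope, intercept, r, p, se = stats.linregress(sem_sim[iu],
                                                  rating_corr[iu])
    return r ** 2, p  # variance in rating agreement explained by similarity

A large R² from such a regression would suggest that respondents are reacting to the wording overlap between items rather than to the underlying construct, which is the shallow-processing concern raised above.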
Ideally, questionnaire items should be grounded in a large pool of empirical studies that are likely to have identified multiple facets of a given construct. Constructs could then evolve as new knowledge is added to the corpus. Such a practice is, however, currently infeasible due to a lack of scalability across different empirical studies, as the exact procedure of item generation, the full list of generated items, and their relationships are frequently not properly reported (Hinkin, 1995; Scandura and Williams, 2000). A computational infrastructure such as the one sketched above would enable social scientists to leverage insights across different exploratory empirical studies, which would in turn lead to scalable scale development practices within and across different research groups.
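As a hypothetical illustration of the reporting gaps mentioned above, the records below capture the minimum fields whose absence blocks cross-study reuse: the generation procedure, the full item list, and item relationships. This is a sketch of one possible data layout, not the infrastructure the chapter describes.

from dataclasses import dataclass, field

@dataclass
class ItemRecord:
    """One questionnaire item as it might be archived in a shared corpus."""
    item_id: str
    text: str                # exact item wording as administered
    construct: str           # latent construct the item targets
    study_id: str            # empirical study the item originated from
    generation_method: str   # e.g. "expert brainstorming", "interview coding"
    related_items: list = field(default_factory=list)  # ids of variants/sources

@dataclass
class ConstructRecord:
    """A construct definition that can evolve as studies accumulate."""
    name: str
    definition: str
    item_ids: list = field(default_factory=list)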