Biomedical Engineering Reference
high-stakes standardized tests, where very high reliability is necessary to
make decisions about each individual's competence, more than 100 knowl-
edge questions (items) are routinely used within a knowledge domain. In
this situation, large numbers of items are required both to attain the high
reliability necessary to generate a small standard error of measurement and
to sample adequately a broad domain of knowledge. For ratings of perfor-
mance by expert judges, fewer items on a form may be necessary because
the attribute to be rated is often specific. For any particular measurement
situation, a measurement study can determine how many items are neces-
sary and which items should be deleted or modified to improve the per-
formance of the item set hypothesized to comprise a scale.
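The link between the number of items, reliability, and the standard error of measurement can be sketched numerically. The text does not name a formula; the sketch below assumes classical test theory and uses the Spearman-Brown prophecy formula, which holds only if added items behave like the existing ones. The function names and the example numbers are hypothetical, chosen for illustration.

```python
import math


def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing scale length by length_factor.

    Assumes the added (or removed) items are parallel to the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def items_needed(current_items: int, current_reliability: float,
                 target_reliability: float) -> int:
    """Smallest item count predicted to reach target_reliability."""
    factor = (target_reliability * (1 - current_reliability)) / (
        current_reliability * (1 - target_reliability)
    )
    # Round before taking the ceiling so floating-point noise does not
    # push an exact integer result up by one.
    return math.ceil(round(current_items * factor, 9))


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability), in the same units as the scores."""
    return score_sd * math.sqrt(1 - reliability)


if __name__ == "__main__":
    # Hypothetical scale: 10 items with reliability 0.70, target 0.95.
    print(items_needed(10, 0.70, 0.95))
    # Hypothetical score SD of 10 points at two reliability levels.
    print(standard_error_of_measurement(10, 0.70))
    print(standard_error_of_measurement(10, 0.95))
```

Under these assumptions, raising reliability from 0.70 to 0.95 takes roughly eight times as many items, which is consistent with the large item counts described above for high-stakes tests. A measurement study with real data remains the way to check whether the parallel-items assumption is reasonable.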
Improving Measurement with Items
We offer here several practical suggestions to minimize measurement errors
through attention to item design. We focus on ratings and elicitations
of attitudes and beliefs because these applications arise frequently during
the evaluations that are the focus of this topic.
1. Make items specific. Perhaps the single most important way to improve
items is to make them as specific as possible. The more information the
respondents get from the item itself, about what exactly is being asked for
and what the response options mean, the greater is the consistency and thus
the reliability of the results. Consider a basic item that may be part of a
multi-item rating form (Figure 6.7A). As a first step toward specificity, the
item should offer a definition of the attribute to be rated, as shown in Figure
6.7B. The next step is to change the response categories from broad
qualitative judgments to behaviors or events that might be observed. As
shown in Figure 6.7C, we might change the logic of the responses by asking
specifically how frequently the explanations were clear.
2. Match the logic of the response to that of the stem. This step is vitally
important. If the stem—the part of the item that elicits a response—
requests an estimate of a quantity, the response formats must offer a range
of reasonable quantities from which to choose. If the stem requests a
strength of belief, the response formats must offer an appropriate way to
express the strength of belief, such as the familiar “strongly agree” to
“strongly disagree” format.
3. Provide a range of semantically and logically distinct response options.
Be certain that the categories span the range of possible responses and do
not overlap. When response categories are given as quantitative ranges,
novice item developers often overlap the edges of the response ranges, as
in the following example.
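As a supplementary, hypothetical illustration of this pitfall, the sketch below checks a set of quantitative response categories for overlapping edges and for gaps. It assumes the categories are whole-number ranges with inclusive bounds. The function name and the sample categories are invented for illustration and are not taken from the text.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (lower, upper), both bounds inclusive


def find_range_problems(ranges: List[Range]) -> List[str]:
    """Report overlaps and gaps among integer response categories."""
    problems: List[str] = []
    if not ranges:
        return problems
    ordered = sorted(ranges)
    covered_to = ordered[0][1]  # highest value covered so far
    for lower, upper in ordered[1:]:
        if lower <= covered_to:
            problems.append(
                f"overlap: category {lower}-{upper} begins at {lower}, "
                f"but earlier categories already cover up to {covered_to}"
            )
        elif lower > covered_to + 1:
            problems.append(f"gap: no category covers {covered_to + 1}-{lower - 1}")
        covered_to = max(covered_to, upper)
    return problems


if __name__ == "__main__":
    # Hypothetical categories whose edges overlap: a respondent with a
    # value of exactly 5 could pick either the first or second category.
    overlapping = [(0, 5), (5, 10), (10, 15)]
    # The same categories rewritten so that each value fits exactly one.
    distinct = [(0, 4), (5, 9), (10, 14)]
    for problem in find_range_problems(overlapping):
        print(problem)
    print(find_range_problems(distinct))
```

A check like this covers only the logical side of the suggestion. Whether the categories are semantically distinct to respondents still has to be judged by reading the items.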