being investigated. If you wanted to conduct an evaluation of the usability of an
application that is used in an office environment, for example, you would need to
make sure that your setting resembled that of a normal office, where telephone
calls and conversations (work and non-work related) are both constant sources of
interruptions to task performance. The downside of having high ecological validity
is that you cannot control all the possible independent variables that may affect the
thing you are trying to measure.
At first glance there appears to be a conflict between internal and external (and
ecological) validity. The main reason for carrying out evaluations in a laboratory
setting is so that you can control for all interfering variables. Although this will
increase your internal validity, you lose external and ecological validity because
you are using an artificial context for collecting data, and your results may not
generalize to the real world. If
you are carrying out an evaluation in a real world setting, however—using
observation, for example—you will have high external (and ecological) validity
but your internal validity will be reduced. Whether this is a problem or not depends
on your research strategy. If you are following an inductive research strategy, then
it is a problem because you will be concerned with the generalization of results; if
you are following a deductive strategy, to test a theory, for example, then it is not a
problem, because you are only concerned with threats to internal validity.
13.2.5.2 Reliability
Reliability refers to the ability of a measure to produce consistent results when the
same things are measured under different conditions. The term is most often used in the
context of test-retest reliability. In other words, if you conducted the same test again
under the same conditions, but on a different day or with a similar set of participants,
for example, you should get the same results if the measure is reliable. Reliability is
also used in the context of assessing coding schemes, particularly when you need to
encode the responses that you collect from users. If a coding scheme is reliable, then
when you give the scheme and the data to another person, they should code the same
data items in the same way. The level of agreement between the people who do the
coding is called the inter-rater reliability, and you can measure this
statistically (Cohen's kappa is often used for this purpose).
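As an illustration of how inter-rater agreement can be quantified, the following is a minimal sketch of Cohen's kappa for two coders, written in plain Python. The function name, the code labels ("error", "help", "nav"), and the data are hypothetical and chosen only for the example. Kappa compares the observed proportion of agreement with the agreement that would be expected by chance from each coder's own frequency of using each code.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assigned one code per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: proportion of items given the same code by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: computed from each rater's marginal code frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)

    if p_expected == 1.0:
        # Both raters used one identical code for every item; kappa is
        # undefined here, so this sketch reports perfect agreement.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical data: two coders classify the same ten user responses.
coder_1 = ["error", "error", "help", "nav", "nav",
           "help", "error", "nav", "help", "error"]
coder_2 = ["error", "help", "help", "nav", "nav",
           "help", "error", "nav", "error", "error"]

kappa = cohen_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up data set the coders agree on 8 of the 10 items, but because some of that agreement would be expected by chance, kappa comes out lower than the raw 80% agreement (about 0.70). A value of 1 indicates perfect agreement and a value of 0 indicates agreement no better than chance.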
13.2.5.3 Sensitivity
Even if the selected measure is both valid and reliable, it may not be sensitive
enough to produce discernible effects that can easily be measured. The chosen
measure may not change very much when you change the independent variables,
for example. In this case it may be necessary to use a large number of participants.
To achieve results that are statistically significant, however, you will still need to
make sure that the measure has high reliability; otherwise your results will still be
open to question.