Fortunately, there is a way to take this into account. Binary success rates are
essentially proportions: the proportion of users who completed a given task
successfully. For example, if 5 of the 10 participants completed a task, the success
rate is 5/10 = 0.5. The appropriate way to calculate a confidence interval for a
proportion like this is to use a binomial confidence interval. Several methods
are available for calculating binomial confidence intervals, such as the Wald
Method and the Exact Method. But as Sauro and Lewis (2005) have shown,
many of those methods are too conservative or too liberal in their calculation of
the confidence interval when dealing with the small sample sizes we commonly
have in usability tests. They found that a modified version of the Wald Method,
called the Adjusted Wald, yielded the best results when calculating a confidence
interval for task success data.
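For readers who want the arithmetic behind the Adjusted Wald Method, one common formulation is sketched below. The symbols are assumptions introduced here, not notation from the text: x is the number of users who succeeded, n is the number who attempted the task, and z is the standard normal critical value for the chosen confidence level (about 1.96 for 95%).

```latex
\hat{p}_{\text{adj}} = \frac{x + z^{2}/2}{n + z^{2}},
\qquad
\text{CI} = \hat{p}_{\text{adj}} \pm z \sqrt{\frac{\hat{p}_{\text{adj}}\,(1 - \hat{p}_{\text{adj}})}{n + z^{2}}}
```

At the 95% level this amounts to adding roughly two successes and two failures to the observed data before applying the ordinary Wald formula, which pulls the estimate toward 50% when the sample is small.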
CONFIDENCE INTERVAL CALCULATOR
Jeff Sauro has provided a very useful calculator for determining confidence intervals for
binary success on his website (http://www.measuringusability.com/wald). By entering the
total number of people who attempted a given task and how many of them completed
it successfully, this tool will perform the Wald, Adjusted Wald, Exact, and Score
calculations of the confidence interval for the mean task completion rate automatically.
You can choose to calculate a 99, 95, or 90% confidence interval. If you really want to
calculate confidence intervals for binary success data yourself, the details are included on
our website.
If 4 out of 5 users completed a given task successfully, the Adjusted Wald
Method yields a 95% confidence interval for that task completion rate ranging
from 36 to 98%—a rather large range! However, if 16 out of 20 users completed
the task successfully (the same proportion), the Adjusted Wald Method yields a
95% confidence interval of 58 to 93%. If you really got carried away and ran a
usability test with 100 participants, of whom 80 completed the task successfully,
the 95% confidence interval would be 71 to 87%. As is almost always the case
with confidence intervals, larger sample sizes yield narrower (that is, more
precise) intervals.
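The three intervals above can be checked with a few lines of Python. This is a minimal sketch of the usual Adjusted Wald calculation, not the code behind the calculator mentioned earlier; the function name adjusted_wald and its parameters are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_wald(successes, n, confidence=0.95):
    """Return (low, high) bounds of an Adjusted Wald interval for a proportion.

    Hypothetical helper for illustration: adds z^2/2 successes and z^2/2
    failures to the observed counts, then applies the ordinary Wald formula.
    """
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = n + z ** 2                        # adjusted sample size
    p_adj = (successes + z ** 2 / 2) / n_adj  # adjusted proportion
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # A proportion cannot fall outside 0..1, so clip the bounds.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


if __name__ == "__main__":
    # The three scenarios from the text, all with an observed rate of 80%.
    for successes, n in [(4, 5), (16, 20), (80, 100)]:
        low, high = adjusted_wald(successes, n)
        print(f"{successes}/{n}: {low:.0%} to {high:.0%}")
```

Rounded to whole percentages, the results should agree with the intervals quoted above. Passing confidence=0.90 or confidence=0.99 gives the other two levels offered by the calculator.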
4.1.2 Levels of Success
Identifying levels of success is useful when there are reasonable shades of gray
associated with task success. The user receives some value from completing a task
partially. Think of it as partial credit on a homework assignment if you showed
your work, even though you got the wrong answer. For example, assume that a
user's task is to find the least expensive digital camera with at least 10 megapixel
resolution, at least 12× optical zoom, and weighing no more than 3 pounds.
What if the user found a camera that met most of those criteria but had a 10×