Combined and Comparative Metrics - Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics


2. In the cell to the right of the first time, enter the formula:

=(MAX(A:A)-A1)/(MAX(A:A)-MIN(A:A))

3. Copy this formula down as many rows as there are times to be transformed.
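The same transformation can be sketched outside a spreadsheet. The Python function below is only an illustration: the function name and the sample times are assumptions, not data from the tables. It maps the shortest time to 100% (1.0) and the longest to 0% (0.0), as the spreadsheet formula does.

```python
def times_to_percentages(times):
    """Rescale times so the shortest maps to 1.0 and the longest to 0.0."""
    longest = max(times)
    shortest = min(times)
    spread = longest - shortest
    if spread == 0:
        # The spreadsheet formula would divide by zero in this case.
        raise ValueError("all times are identical; the transformation is undefined")
    return [(longest - t) / spread for t in times]


# Hypothetical average task times in seconds (not taken from the book's tables).
sample_times = [30, 45, 60, 90]
sample_percentages = times_to_percentages(sample_times)
print(sample_percentages)
```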

Table 8.3 also shows the average of these percentages for each of the participants. If any one participant had completed all the tasks successfully in the shortest average time and had given the product a perfect score on the subjective rating scales, that person's average would have been 100%. However, if any one participant had failed to complete any of the tasks, had taken the longest time per task, and had given the product the lowest possible score on the subjective rating scales, that person's average would have been 0%. Of course, rarely do you see either of those extremes. Like the sample data in Table 8.3, most participants fall between those two extremes. In this case, averages range from a low of 28% (Participant 4) to a high of 85% (Participant 9), with an overall average of 58%.
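As a minimal sketch of this averaging step, the Python snippet below combines three percentages per participant into one score and then averages those scores. The participant labels and values are hypothetical and chosen to show the two extremes; they are not the data in Table 8.3.

```python
from statistics import mean

# Hypothetical percentages on a 0.0 to 1.0 scale, in the order:
# (task completion, transformed task time, subjective rating)
participants = {
    "P1": (1.0, 1.0, 1.0),    # best possible result on every measure
    "P2": (0.0, 0.0, 0.0),    # worst possible result on every measure
    "P3": (0.5, 0.75, 0.25),  # a more typical, in-between participant
}


def combined_score(percentages):
    """Equal-weight average of one participant's percentages."""
    return mean(percentages)


per_participant = {name: combined_score(p) for name, p in participants.items()}
overall = mean(list(per_participant.values()))
print(per_participant, overall)
```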

CALCULATING PERCENTAGES ACROSS ITERATIONS OR DESIGNS

One of the valuable uses of this kind of overall score is in making comparisons across iterations or releases of a product or across different designs. But it's important to do the transformation across all of the data at once, not separately for each iteration or design. This is particularly important for time data, where the times that you've collected are determining the best and worst times. That selection of the best and worst times should be done by looking across all of the conditions, iterations, or designs that you want to compare.
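To illustrate why pooling matters, the sketch below transforms times using the best and worst times found across all designs together. The design labels and times are hypothetical, and the function name is an assumption made for this example.

```python
def pooled_time_percentages(times_by_design):
    """Transform times using the best and worst times across ALL designs."""
    all_times = [t for times in times_by_design.values() for t in times]
    longest = max(all_times)
    shortest = min(all_times)
    spread = longest - shortest
    if spread == 0:
        raise ValueError("all times are identical; the transformation is undefined")
    return {
        design: [(longest - t) / spread for t in times]
        for design, times in times_by_design.items()
    }


# Hypothetical task times in seconds for two designs.
times_by_design = {"A": [40, 60, 80], "B": [20, 30, 40]}
pooled = pooled_time_percentages(times_by_design)
print(pooled)
```

With these made-up numbers, transforming each design separately would give both designs the same percentages (100%, 50%, 0%) and hide the fact that design B is faster. The pooled version keeps that difference visible.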

So if you had to give an “overall score” to the product whose test results are shown in Tables 8.2 and 8.3, you could say it got 58% overall. Most people wouldn't be too happy with 58%. Many years of grades from school have probably conditioned most of us to think of a percentage that low as a “failing grade.” But you should also consider how accurate that percentage is. Because it's an average based on individual scores from 10 different participants, you can construct a confidence interval for that average, as explained in Chapter 2. The 90% confidence interval in this case is ±11%, meaning that the confidence interval extends from 47 to 69%. Running more participants would probably give you a more accurate estimate of this value, whereas running fewer would probably have made it less accurate.
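A confidence interval of this kind can be sketched as follows. This excerpt does not say which method Chapter 2 uses, so the choice of a Student's t critical value here is an assumption, and the scores are hypothetical rather than the values behind the 58% average.

```python
from math import sqrt
from statistics import mean, stdev


def confidence_interval(scores, critical_value):
    """Return (mean, half_width) of a confidence interval for the mean."""
    center = mean(scores)
    standard_error = stdev(scores) / sqrt(len(scores))
    return center, critical_value * standard_error


# Hypothetical combined scores (in percent) for 10 participants.
scores = [40, 50, 60, 70, 80, 40, 50, 60, 70, 80]

# Two-sided 90% critical value of Student's t with 9 degrees of freedom.
T_CRITICAL_90_DF9 = 1.833

center, half_width = confidence_interval(scores, T_CRITICAL_90_DF9)
low, high = center - half_width, center + half_width
print(f"{center:.0f}% ± {half_width:.1f}% ({low:.1f}% to {high:.1f}%)")
```

Because the standard error divides by the square root of the sample size, adding participants narrows the interval, which matches the point above about running more participants.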

One thing to be aware of is that when we averaged the three percentages together (from task completion data, task time data, and subjective ratings), we gave equal weight to each of those measures. In many cases, that's a perfectly reasonable thing to do, but sometimes the business goals of the product
