Combined and Comparative Metrics - Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics


WATCH OUT FOR OUTLIERS

When transforming any data where you're letting observed values determine the minimum

or maximum (e.g., times or errors), you need to be particularly cautious about outliers. For

example, in the data shown in Table 8.6, what if Participant #4 had made 20 errors instead

of 5? The net effect would have been that his transformed percentage would still have been

0% but all of the others would have been pushed much higher. One of the standard ways

of detecting outliers is by calculating the mean and standard deviation of all your data and

then considering any values more than twice or three times the standard deviation away

from the mean as outliers. (Most people use twice the standard deviation, but if you want

to be really conservative, use three times.) For the purpose of transforming data, those

outliers should be excluded. In this modified example, the mean plus twice the standard

deviation of the number of errors is 14.2, while the mean plus three times the standard

deviation is 19.5. By either criterion, you should treat 20 errors as an outlier and exclude it.
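The outlier check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the error counts are hypothetical (they are not the data from Table 8.6), the function name find_outliers is made up for this example, and the sample standard deviation (n - 1 denominator) is assumed.

```python
from statistics import mean, stdev


def find_outliers(values, k=2):
    """Return the values lying more than k standard deviations from the mean.

    Hypothetical helper for illustration; k=2 is the common criterion,
    k=3 the more conservative one.
    """
    m = mean(values)
    sd = stdev(values)  # sample SD (n - 1 denominator); an assumption here
    return [v for v in values if abs(v - m) > k * sd]


# Hypothetical error counts for 12 participants (NOT the data from Table 8.6).
errors = [2, 1, 0, 20, 3, 2, 1, 0, 2, 1, 3, 2]

outliers = find_outliers(errors, k=2)

# Exclude the outliers before using the data to set the maximum for a transformation.
kept = [v for v in errors if v not in outliers]
```

With these made-up counts, the value 20 is flagged under both the two-standard-deviation and the three-standard-deviation criterion, so the maximum used for the transformation would come from the remaining values.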

When transforming any usability metric to a percentage, the general rule is to first determine the minimum and maximum values that the metric can possibly have. In many cases this is easy; they are predefined by the conditions of the usability test. Here are the various cases you might encounter.

• If the minimum possible score is 0 and the maximum possible score is 100 (e.g., a SUS score), then you've basically already got a percentage. Just divide by 100 to make it a true percentage.

• In many cases, the minimum is 0 and the maximum is known, such as the total number of tasks or the highest possible rating on a rating scale. In that case, simply divide the score by the maximum to get the percentage. (This is why it's generally easier to code rating scales starting with 0 as the worst value.)

• In some cases, the minimum is 0 but the maximum is not known, such as the example of errors. In that situation, the maximum would need to be defined by the data: the highest number of errors any participant made. Specifically, each participant's number of errors is divided by that maximum, and the result is subtracted from 1, so that fewer errors yield a higher percentage.

• Finally, in some cases, neither minimum nor maximum possible scores are predefined, as with time data. In this case, you can use your data to determine the minimum and maximum values. Assuming higher values are worse, as is the case with time data, you would divide the difference between the highest value and the observed value by the difference between the highest and the lowest values. The slowest participant then scores 0% and the fastest scores 100%.
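The cases in the list above can be sketched as three small Python functions. This is a minimal sketch, not a definitive implementation: the function names and the sample numbers are hypothetical, results are returned as proportions between 0 and 1, and the handling of the degenerate cases (nobody made an error, or all times are equal) is an assumption not covered in the text.

```python
def pct_known_max(score, max_score):
    """Minimum is 0 and the maximum is known (e.g., SUS out of 100, tasks completed)."""
    return score / max_score


def pct_errors(errors, all_errors):
    """Minimum is 0, maximum comes from the data; higher values are worse."""
    worst = max(all_errors)
    if worst == 0:
        return 1.0  # assumption: nobody made an error, so everyone gets 100%
    return 1 - errors / worst


def pct_time(value, all_values):
    """Neither bound is predefined; higher values are worse (e.g., task time)."""
    lowest, highest = min(all_values), max(all_values)
    if highest == lowest:
        return 1.0  # assumption: identical values, so no one is worse than anyone else
    return (highest - value) / (highest - lowest)


# Hypothetical data for three participants.
error_counts = [0, 2, 5]
times_sec = [30, 60, 90]

error_pcts = [pct_errors(e, error_counts) for e in error_counts]
time_pcts = [pct_time(t, times_sec) for t in times_sec]
```

Because the last two functions take their bounds from the observed data, any outliers should be excluded before calling them; otherwise a single extreme value compresses everyone else's percentages toward the top of the scale.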

WHAT IF HIGHER NUMBERS ARE WORSE?

Although higher numbers are better in cases such as task success rates, in other cases

they're worse, such as time or errors. Higher numbers could also be worse in a rating

scale if it was defined that way (e.g., 0-6, where 0 = Very Easy and 6 = Very Difficult).
