We believe that the topic is so pervasive that it merits the reader's understanding
of the “philosophy” or “reasoning” that it entails. Indeed, hypothesis testing
could be considered “The Statistical Analysis Process,” if only one topic in the entire
fields of statistics, data analysis, and predictive analytics deserved that description.
An example may help to illustrate the concept of hypothesis testing. Suppose that
you're working as a UX researcher at a company where the CEO wants to completely
change the design of the company's Web home page. Let's suspend reality for a moment
and agree that we know, for the current Web home page, that the true mean (not the
“X-bar”) satisfaction rating is 4.10 on a scale of 1-5, where 1 = Not At All Satisfied
and 5 = Extremely Satisfied.
Recently, the design team has come up with a new home page design. So far, everyone
seems to like it, but you want some empirical evidence that the new design is indeed an
improvement over the current design in terms of mean satisfaction. You decide to run a
25-person survey and probe satisfaction with the new design. You calculate the new
satisfaction rating mean (the X-bar) for the new design from your sample of users and get 4.15.
(The true/population mean of the new design is the one, and only one, unknown value.)
SIDEBAR: WHAT IF WE KNOW NEITHER TRUE MEAN?
In Chapter 2, we consider the more detailed, and more common, case in which we do not know the
true mean for either design. One home page features a young waif sipping coffee at an outdoor cafe,
while the other shows a romantic couple meeting in front of the Eiffel Tower. These are the two
competing designs of Chapter 2, and, as we noted, we don't know the true mean satisfaction rating
of either one; we explain there how to deal with this scenario.
Well, 4.15 is obviously higher than 4.10. So the new design wins, right? Or does
it? That depends. The true mean for the new design might be 4.00 (notably under
the 4.10 of the current design), and just due to routine variability, these 25 people
happened to give you an X-bar for the new design of 4.15.
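To get a feel for how easily that can happen, here is a small Python sketch (our own illustration, not part of the survey; the 0.90 standard deviation for individual ratings is an assumption we invented, and we let the simulated ratings run past the 1-5 scale for simplicity). It simulates many 25-person surveys from a population whose true mean really is 4.00 and checks how often the resulting X-bar comes out at 4.15 or higher.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

true_mean = 4.00      # assumed true mean of the new design (below the current 4.10)
assumed_sd = 0.90     # made-up standard deviation of individual ratings
n = 25                # survey size
trials = 100_000      # number of simulated surveys

# Draw `trials` surveys of 25 ratings each and compute each survey's X-bar.
sample_means = rng.normal(true_mean, assumed_sd, size=(trials, n)).mean(axis=1)

# How often does ordinary sampling variability alone push X-bar to 4.15 or above?
print(f"Share of surveys with X-bar >= 4.15: {np.mean(sample_means >= 4.15):.3f}")
```

With these assumed numbers, roughly one simulated survey in five produces an X-bar of 4.15 or more even though the true mean is only 4.00, which is exactly the kind of routine variability at issue.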
SIDEBAR: OMNIPRESENT VARIABILITY
As we've mentioned, variability almost always exists in a data set; in the UX world, the
variability depends on who happens to be in the sample of people. Similarly, when you flip
10 coins, you expect to get around 5 heads (perhaps 4 or 6, or, rarely, 3 or 7), but you
might get 2 or 8 heads, or an even more lopsided result than that.
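If you want to see that lopsidedness for yourself, here is a tiny Python sketch (our own illustration, not part of the text) that flips 10 fair coins 100,000 times and tallies how often each number of heads comes up:

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Flip 10 fair coins 100,000 times; record the number of heads in each set of flips.
heads = rng.binomial(n=10, p=0.5, size=100_000)

# Tally the share of flip-sets landing on each head count, 0 through 10.
counts = np.bincount(heads, minlength=11)
for k, c in enumerate(counts):
    print(f"{k:2d} heads: {c / len(heads):.3f}")
```

Five heads is the single most common outcome, 4 and 6 are routine, 3 and 7 are noticeably rarer, and 2 or 8 heads still turn up a few percent of the time.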
So, you need to decide: is an X-bar of 4.15 really enough above 4.10 to constitute
convincing evidence that the new design has a higher true mean satisfaction?
Well, since you're reading this book, you're savvy enough by now to say “probably
not!” (By the way, this is not the same as getting an X-bar of 4.8. That value is way above
4.10, and it necessarily comes with little disagreement among the 25 people: if the X-bar is
4.8 out of 5, there just can't be a lot of disagreement!)
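To make that contrast concrete, here is a short Python sketch (our own illustration; the ratings are made up, and the one-sample t-test used here is a standard way to weigh a sample mean against a known benchmark, not a method prescribed at this point in the text). It compares a spread-out sample of 25 ratings averaging about 4.16 with a tightly clustered sample averaging 4.80, each tested against the known 4.10.

```python
import numpy as np
from scipy import stats

benchmark = 4.10  # known true mean satisfaction of the current design

# Two hypothetical sets of 25 ratings (made up for illustration, not survey data):
# one averaging 4.16 with plenty of disagreement, one averaging 4.80 with very little.
noisy_sample = np.array([5] * 13 + [4] * 7 + [3] * 2 + [2] * 2 + [1] * 1, dtype=float)
tight_sample = np.array([5] * 20 + [4] * 5, dtype=float)

for label, sample in [("spread-out ratings", noisy_sample), ("tight ratings", tight_sample)]:
    # One-sided, one-sample t-test: is the new design's true mean greater than 4.10?
    t_stat, p_value = stats.ttest_1samp(sample, popmean=benchmark, alternative="greater")
    print(f"{label}: X-bar = {sample.mean():.2f}, sd = {sample.std(ddof=1):.2f}, "
          f"t = {t_stat:.2f}, one-sided p = {p_value:.4g}")
```

With these made-up numbers, the spread-out sample gives t of about 0.26 and a one-sided p near 0.4, no real evidence that the true mean exceeds 4.10, while the tight sample at 4.80 gives t of about 8.6 and a p-value far below any conventional cutoff.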
 