4.8 ADDENDUM 2: COMPARING MORE THAN TWO TREATMENTS
Now, we illustrate another case where we have a task and there are more than two
search engines under investigation. Statisticians would call this comparing more than
two “treatments.”
Suppose that we have a third search engine under examination, the Microhard
search engine introduced in the previous section. We also have another 10 people
attempt Task 1 using that search engine, with eight successful completions.
If we use “p1” to stand for the true completion rate for Task 1, and “N” for
Novix, “B” for Behemoth, and “M” for Microhard, we would have a null hypothesis of
H0: p1N = p1B = p1M
vs. H1: not all three values of p1 are the same
(As a reminder: Task 1 is “Find a Java developer with at least 5 years' experience
within 50 miles of Tucson, Arizona.”) As another reminder—and we apologize for
the repetitiveness of this one—the true successful completion values are those that
would result if all people who could ever be performing the task with that search
engine indeed performed the task (successfully or not). We can sum up the sample
results in Table 4.8.
As we see in Table 4.8, N had three successful completions and seven failures,
B had nine successful completions and one failure, while M had eight successful
completions and two failures.
4.8.1 EXCEL
To do the chi-square test of independence in Excel, given a table such as Table 4.8,
you find the table of theoretical frequencies the way you did earlier, arriving at
Table 4.9.
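For readers who want to check the theoretical frequencies outside of Excel, the calculation can be sketched in a few lines of Python. This is only an illustration, not part of the Excel procedure: it assumes the usual rule that each expected cell equals its row total times its column total, divided by the grand total, and the names `observed` and `expected_frequencies` are ours.

```python
# Observed counts from Table 4.8 (rows: Pass, Fail; columns: N, B, M).
observed = [
    [3, 9, 8],  # Pass
    [7, 1, 2],  # Fail
]


def expected_frequencies(table):
    """Return row_total * column_total / grand_total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for label, row in zip(("Pass", "Fail"), expected):
    print(label, [round(value, 2) for value in row])
```

With 20 passes and 10 failures spread over three groups of 10 people, each Pass cell comes out to about 6.67 and each Fail cell to about 3.33.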
The remaining steps are identical to those performed in Excel when there were
only two search engines. You would start with Figure 4.9 .
For the observed frequency table, the range is E4 to G5. Similarly, the
range for the theoretical (expected) frequencies is E10 to G11.
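As a cross-check on the Excel result, the same test can be sketched in Python. The sketch assumes the standard chi-square statistic, the sum of (observed - expected)^2 / expected over all cells, with (rows - 1) x (columns - 1) degrees of freedom. For this 2 x 3 table that gives 2 degrees of freedom, and in that special case the upper-tail probability is exactly exp(-x/2), so no statistics library is needed. The function name `chi_square_independence` is ours, and the 0.05 level mentioned afterward is an assumption, since this section does not state a significance level.

```python
import math

# Observed counts from Table 4.8 (rows: Pass, Fail; columns: N, B, M).
observed = [
    [3, 9, 8],  # Pass
    [7, 1, 2],  # Fail
]


def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under the null hypothesis of equal completion rates.
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


statistic, df = chi_square_independence(observed)

# The closed form exp(-x/2) is valid only for exactly 2 degrees of freedom.
if df != 2:
    raise ValueError("closed-form p-value below requires 2 degrees of freedom")
p_value = math.exp(-statistic / 2.0)

print(f"chi-square = {statistic:.2f}, df = {df}, p-value = {p_value:.4f}")
```

For the data in Table 4.8 this gives a chi-square statistic of about 9.30 and a p-value of about 0.0096. At a 0.05 significance level, that would lead us to reject H0 and conclude that the three completion rates are not all the same.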
Table 4.8 Observed Frequencies for Hypothesis Test
Task 1
Search Eng:    N    B    M
Pass           3    9    8
Fail           7    1    2
 