In either case, t.test will compute a p-value. Conventionally, if p < 0.05, the means
are likely different, whereas p > 0.05 provides no such evidence. Keep two points in mind:
• If either sample size is small, the populations must be normally distributed. Here,
“small” means fewer than 20 data points.
• If the two populations have the same variance, specify var.equal=TRUE to obtain a
less conservative test.
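
As a minimal sketch of the options above, the following compares the default call with the var.equal=TRUE variant. The samples x and y are simulated here purely for illustration (an assumption; they are not the data used elsewhere in this recipe).

```r
set.seed(42)                               # hypothetical data, for illustration only
x <- rnorm(30, mean = 100, sd = 15)
y <- rnorm(30, mean = 110, sd = 15)

welch  <- t.test(x, y)                     # default: does not assume equal variances
pooled <- t.test(x, y, var.equal = TRUE)   # assumes the populations share one variance

welch$p.value                              # p-value of the default (Welch) test
pooled$p.value                             # p-value of the pooled-variance test
pooled$parameter                           # degrees of freedom: n1 + n2 - 2
```

The object returned by t.test is a list, so the p-value can be pulled out with $p.value rather than read off the printed report.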
Discussion
I often use the t test to get a quick sense of the difference between two population
means. It requires that the samples be large enough—both samples have 20 or more
observations—or that the underlying populations be normally distributed. I don't take
the “normally distributed” part too literally. Being bell-shaped should be good enough.
A key distinction here is whether your data contains paired observations, since the
results may differ between the two cases. Suppose we want to know if coffee in the
morning improves scores on SAT tests. We could run the experiment two ways:
• Randomly select one group of people. Give them the SAT test twice, once with
morning coffee and once without morning coffee. For each person, we will have
two SAT scores. These are paired observations.
• Randomly select two groups of people. One group has a cup of morning coffee and
takes the SAT test. The other group just takes the test. We have a score for each
person, but the scores are not paired in any way.
Statistically, these experiments are quite different. In experiment 1, there are two
observations for each person (with coffee and without) and they are not statistically
independent. In experiment 2, the data is independent.
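
To see why the two scores in experiment 1 are not independent, here is a small simulation sketch. Everything in it is an assumption made for illustration: the variable names, the per-person "ability" baseline, the noise levels, and the 5-point coffee effect.

```r
set.seed(1)                                  # hypothetical data, for illustration only
n <- 100
ability <- rnorm(n, mean = 500, sd = 100)    # each person's underlying skill

# Experiment 1: the same people are tested twice, so both scores share 'ability'
without_coffee <- ability + rnorm(n, sd = 10)
with_coffee    <- ability + 5 + rnorm(n, sd = 10)   # assumed 5-point effect

cor(without_coffee, with_coffee)             # strongly correlated: not independent
```

Because each person contributes the same baseline to both scores, the two columns move together. In experiment 2 there is no such shared baseline, since every score comes from a different person.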
If you have paired observations (experiment 1) and erroneously analyze them as
unpaired observations (experiment 2), then you could get this result with a p-value of
0.9867:
> t.test(x,y)

        Welch Two Sample t-test

data:  x and y
t = -0.0166, df = 198, p-value = 0.9867
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -30.56737  30.05605
sample estimates:
mean of x mean of y
 501.2008  501.4565
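
The effect of ignoring the pairing can be reproduced with a short sketch. The data below is simulated under assumed values (a shared per-person baseline and a 5-point gain), so its numbers will not match the output above; the point is only to compare the two analyses on the same data.

```r
set.seed(1)                               # hypothetical data, for illustration only
n <- 100
ability <- rnorm(n, mean = 500, sd = 100) # per-person baseline shared by both scores
x <- ability + rnorm(n, sd = 10)          # score without coffee
y <- ability + 5 + rnorm(n, sd = 10)      # score with coffee (assumed 5-point gain)

unpaired <- t.test(x, y)                  # ignores the pairing
paired   <- t.test(x, y, paired = TRUE)   # analyzes the per-person differences

unpaired$p.value
paired$p.value

# A paired t test is equivalent to a one-sample t test on the differences
by_diff <- t.test(x - y)
all.equal(unname(paired$statistic), unname(by_diff$statistic))
```

The unpaired analysis has to compare the small difference in means against the large person-to-person spread, while the paired analysis compares it only against the spread of the within-person differences, which is why it is far more sensitive here.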
 