
Chapter 11: Two-Sample Hypothesis Testing
t for Two
The example in the preceding section involves a situation you rarely
encounter — known population variances. If you know a population’s vari-
ance, you’re likely to know the population mean. If you know the mean, you
probably don’t have to perform hypothesis tests about it.
Not knowing the variances takes the Central Limit Theorem out of play. This
means that you can’t use the normal distribution as an approximation of the
sampling distribution of the difference between means. Instead, you use the
t-distribution, a family of distributions I introduce in Chapter 9 and apply to
one-sample hypothesis testing in Chapter 10. The members of this family of
distributions differ from one another in terms of a parameter called degrees
of freedom (df). Think of df as the denominator of the variance estimate you
use when you calculate a value of t as a test statistic. Another way to say
“calculate a value of t as a test statistic”: “Perform a t-test.”
Unknown population variances lead to two possibilities for hypothesis testing.
One possibility is that although the variances are unknown, you have reason to
assume they’re equal. The other possibility is that you cannot assume they’re
equal. In the subsections that follow, I discuss these possibilities.
Like peas in a pod: Equal variances
When you don’t know a population variance, you use the sample variance
to estimate it. If you have two samples, you average (sort of) the two sample
variances to arrive at the estimate.
Putting sample variances together to estimate a population variance is called
pooling. With two sample variances, here’s how you do it:
sp² = [(N1 − 1)s1² + (N2 − 1)s2²] / [(N1 − 1) + (N2 − 1)]

In this formula, sp² stands for the pooled estimate. Notice that the denominator of this estimate is (N1 − 1) + (N2 − 1). Is this the df? Absolutely!
The formula for calculating t is

t = [(x̄1 − x̄2) − (μ1 − μ2)] / [sp √(1/N1 + 1/N2)]