5.2 Test statistics described by parameters
The principal task of statistics is to reduce data to something that is interpretable. We see our data, which are observations of an outcome variable, as a representative sample from a hypothetical infinite population in which the corresponding data are described by a distribution function. This distribution is specified up to a few unknown parameters. For example, the ubiquitous Gaussian distribution depends on the parameter vector θ = (m, σ²), where m is the mean and σ² the variance. For the purpose of this discussion we will denote a general distribution function by F(x, θ), where θ may be a parameter vector, but will still be called a parameter. The aim of the statistical analysis is to estimate this parameter using an estimator, which itself has a distribution because its value varies with the sample we draw from the population.
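As a minimal sketch of this last point, the following Python fragment (the population values m = 5, σ = 2 and the sample sizes are illustrative choices, not taken from the text) repeats the sampling many times and shows that the sample mean, viewed as an estimator of m, has a distribution of its own.

```python
import numpy as np

# Illustrative population: Gaussian with mean m = 5 and variance sigma^2 = 4.
m, sigma = 5.0, 2.0
n = 25              # size of each sample drawn from the population
n_samples = 10_000  # number of hypothetical repetitions of the experiment

rng = np.random.default_rng(seed=1)
samples = rng.normal(loc=m, scale=sigma, size=(n_samples, n))

# The estimator (here the sample mean) takes a different value for each
# sample, so it has a distribution of its own: approximately N(m, sigma^2/n).
estimates = samples.mean(axis=1)
print(f"mean of estimates: {estimates.mean():.3f} (true m = {m})")
print(f"sd of estimates:   {estimates.std(ddof=1):.3f} "
      f"(theory: sigma/sqrt(n) = {sigma / np.sqrt(n):.3f})")
```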
Since this will be a discussion about tests, our primary interest is not in the CDFs that describe the population data, but in the CDFs of test statistics (which are summaries of the sample data). In such a case the parameter θ may be multidimensional and contain one part that is of interest (such as the odds ratio) but also some nuisance parameters. For example, if we want to test for a mean difference (our parameter of interest) in Gaussian data, the original model also contains a nuisance parameter, the variance. Such nuisance parameters cannot be ignored; a way to handle them must be found (this is how the t distribution emerges: we use a particular estimate for the variance, which replaces the original Gaussian distribution by a t distribution). To simplify the present discussion we suppress any nuisance parameters from the notation, so that the CDF F(x, θ) of the test statistic contains only one parameter. The null hypothesis of the test we are interested in is written θ = θ₀, where θ₀ is a specified value of the parameter, usually zero or one. This means that the distribution F(x) under the null hypothesis is the same as F(x, θ₀).
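A short simulation can make the role of the nuisance parameter concrete. The sketch below (the values m₀ = 0, σ = 3, n = 10 are illustrative assumptions) standardizes the sample mean with the estimated standard deviation and compares the resulting tail probabilities with the t and Gaussian distributions; once the variance is estimated, it is the t tail that matches.

```python
import numpy as np
from scipy import stats

# Gaussian data with unknown variance; the variance is a nuisance parameter.
m0, sigma, n = 0.0, 3.0, 10   # illustrative values
rng = np.random.default_rng(seed=2)
x = rng.normal(loc=m0, scale=sigma, size=(50_000, n))

# Standardize the sample mean with the *estimated* standard deviation;
# replacing sigma by its estimate turns the Gaussian into a t distribution.
t_stat = (x.mean(axis=1) - m0) / (x.std(axis=1, ddof=1) / np.sqrt(n))

# Empirical tail probability versus the t and Gaussian approximations.
c = 2.0
print(f"empirical P(T >= {c}): {np.mean(t_stat >= c):.4f}")
print(f"t_(n-1) tail:          {stats.t.sf(c, df=n - 1):.4f}")
print(f"Gaussian tail:         {stats.norm.sf(c):.4f}")
```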
When it comes to the parameter θ, statistics addresses two different problems. The first is what we actually know about θ; the second is what the best estimate of θ is. These two problems must not be confused: a small sample may produce an accurate estimate, but our confidence in it may still be rather low. The two problems can be described as follows.
Hypothesis testing. We want a test of the hypothesis that θ = θ₀. The confidence in the rejection of this hypothesis is inferred from the p-values which we compute from the distribution F(x, θ₀). For our discussion we mostly assume that this test is one-sided and that the p-value is computed from the value x∗ of the test statistic X obtained in the experiment as the probability P(X ≥ x∗ | θ₀), that is,

p = 1 − F(x∗−, θ₀),

where F(x∗−, θ₀) denotes the left limit of F at x∗, which differs from F(x∗, θ₀) only when X has point masses. The different p-values based on this observation, as a function of θ₀, are what constitute the confidence function.
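To make this concrete, here is a minimal sketch for a Gaussian mean with known variance (the observed value x∗ and all parameter values are hypothetical). Since F is continuous in this case, F(x∗−, θ₀) = F(x∗, θ₀), and evaluating the p-value over a grid of θ₀ values traces out the confidence function.

```python
import numpy as np
from scipy import stats

# One-sided test for a Gaussian mean with known sigma (illustrative values).
sigma, n = 2.0, 25
x_star = 5.6          # observed value of the test statistic (sample mean)

def p_value(theta0):
    """p = P(X >= x* | theta0) = 1 - F(x*-, theta0); F is continuous
    here, so the left limit F(x*-) equals F(x*)."""
    return stats.norm.sf(x_star, loc=theta0, scale=sigma / np.sqrt(n))

# The p-value for one particular null hypothesis...
print(f"p-value at theta0 = 5: {p_value(5.0):.4f}")

# ...and, as theta0 varies, the whole curve of p-values: the confidence function.
for theta0 in (4.8, 5.0, 5.2, 5.4, 5.6):
    print(f"C({theta0}) = {p_value(theta0):.4f}")
```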
Parameter estimation. This is about finding the ‘best’ value for θ, based on the information the data give us. One estimation method is the likelihood method, which addresses the problem by asking: given the data we have, what is the most likely value of θ? Mathematically this means that we want to find the θ that maximizes the probability L(θ) of what we have observed. We do not use L(θ) as a true probability, and it is therefore referred to as the likelihood function of θ. The estimation method is called the maximum likelihood method.
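As an illustration, the sketch below (the data and starting values are hypothetical) maximizes a Gaussian likelihood numerically and checks the result against the closed-form maximum likelihood estimates.

```python
import numpy as np
from scipy import optimize, stats

# Illustrative data; in practice these are the observations from the experiment.
rng = np.random.default_rng(seed=3)
data = rng.normal(loc=1.5, scale=2.0, size=40)

def neg_log_likelihood(params):
    """Negative log of L(theta) for theta = (m, log sigma); the log
    parameterization keeps sigma positive during the search."""
    m, log_sigma = params
    return -np.sum(stats.norm.logpdf(data, loc=m, scale=np.exp(log_sigma)))

# Maximizing L(theta) is the same as minimizing the negative log-likelihood.
result = optimize.minimize(neg_log_likelihood, x0=[0.0, 0.0])
m_hat, sigma_hat = result.x[0], np.exp(result.x[1])
print(f"numerical MLE: m = {m_hat:.3f}, sigma = {sigma_hat:.3f}")
print(f"closed form:   m = {data.mean():.3f}, sigma = {data.std(ddof=0):.3f}")
```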
Though probably the most important estimation method,