
precisely what led to the notion of P-value as a way of reporting significance
without imposing a particular α on others who might wish to draw their own
conclusions.
Even if a P-value is included in a summary of results, however, there may be
difficulty in interpreting this value and in making a decision. This is because a
small P-value, which would ordinarily indicate statistical significance in that it
would strongly suggest rejection of H₀ in favor of Hₐ, may be the result of a large
sample size in combination with a departure from H₀ that has little practical
significance. In many experimental situations, only departures from H₀ of large
magnitude would be worthy of detection, whereas a small departure from H₀ would
have little practical significance.
Consider as an example testing H₀: μ = 100 versus Hₐ: μ > 100, where μ is
the mean of a normal population with σ = 10. Suppose a true value of μ = 101
would not represent a serious departure from H₀ in the sense that not rejecting H₀
when μ = 101 would be a relatively inexpensive error. For a reasonably large
sample size n, this μ would lead to an x̄ value near 101, so we would not want this
sample evidence to argue strongly for rejection of H₀ when x̄ = 101 is observed.
For various sample sizes, Table 9.1 records both the P-value when x̄ = 101 and also
the probability of not rejecting H₀ at level .01 when μ = 101.
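Entries of the kind shown in Table 9.1 can be computed directly: for this upper-tailed z test, the test statistic when x̄ = 101 is z = (101 − 100)/(10/√n) = √n/10, so the P-value is P(Z ≥ √n/10), and the probability of not rejecting H₀ at level .01 when μ = 101 is Φ(z.01 − √n/10). A minimal sketch of these computations (the particular sample sizes below are illustrative choices, not necessarily those appearing in Table 9.1):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu0, mu, sigma = 100.0, 101.0, 10.0
z_crit = 2.326  # approximate upper-tail critical value z_.01

for n in [25, 100, 400, 2500, 10000]:        # illustrative sample sizes
    z = (mu - mu0) / (sigma / sqrt(n))       # statistic when x-bar = 101
    p_value = 1.0 - phi(z)                   # P(Z >= z): upper-tailed P-value
    beta = phi(z_crit - z)                   # P(do not reject H0 at level .01 | mu = 101)
    print(n, round(p_value, 4), round(beta, 4))
```

Running this shows the pattern the text describes: as n grows, the P-value of x̄ = 101 shrinks toward 0 (so H₀ is rejected ever more emphatically), even though μ = 101 is only a trivial departure from μ₀ = 100.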
The second column in Table 9.1 shows that even for moderately large sample
sizes, the P-value of x̄ = 101 argues very strongly for rejection of H₀, whereas
the observed x̄ itself suggests that in practical terms the true value of μ differs little
from the null value μ₀ = 100. The third column points out that even when there is
little practical difference between the true μ and the null value, for a fixed level of
significance a large sample size will almost always lead to rejection of the null
hypothesis at that level. To summarize, one must be especially careful in interpret-
ing evidence when the sample size is large, since any small departure from H₀ will
almost surely be detected by a test, yet such a departure may have little practical
significance.
Best Tests for Simple Hypotheses
The test procedures presented thus far are (hopefully) intuitively reasonable, but
have not been shown to be best in any sense. How can an optimal test be obtained,
one for which the type II error probability is as small as possible, subject to
controlling the type I error probability at the desired level? Our starting point
here will be a rather unrealistic situation from a practical viewpoint: testing a
simple null hypothesis against a simple alternative hypothesis. A simple hypothesis
is one which, when true, completely specifies the distribution of the sample Xᵢ's.
Suppose, for example, that the Xᵢ's form a random sample from an exponential
distribution with parameter λ. Then the hypothesis H: λ = 1 is simple, since when
H is true each Xᵢ has an exponential distribution with parameter λ = 1. We might
then consider H₀: λ = 1 versus Hₐ: λ = 2, both of which are simple hypotheses.
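Because each of these hypotheses completely specifies the density f(x; λ) = λe^(−λx), the likelihood of any observed sample is an exact number under either one. A brief sketch, using a hypothetical (made-up) sample, of what "completely specified" buys us computationally:

```python
import math

def exp_log_likelihood(sample, lam):
    """Log-likelihood of an exponential(lam) model: sum of log(lam) - lam*x."""
    return sum(math.log(lam) - lam * x for x in sample)

sample = [0.31, 1.42, 0.78, 2.05, 0.96]    # hypothetical observations

ll_h0 = exp_log_likelihood(sample, 1.0)    # under the simple H0: lam = 1
ll_ha = exp_log_likelihood(sample, 2.0)    # under the simple Ha: lam = 2

# Both likelihoods are exact numbers; comparing them is the idea behind
# the "best test" for simple hypotheses developed in this section.
print(round(ll_h0, 4), round(ll_ha, 4))
```

For a composite hypothesis such as H: λ ≥ 1, no such single number exists, since the density is not pinned down.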
The hypothesis H: λ ≥ 1 is not simple, because when H is true, the distribution of
each Xᵢ might be exponential with λ = 1 or with λ = .8 or .... Similarly, if the Xᵢ's
constitute a random sample from a normal distribution with known σ, then
H: μ = 100 is a simple hypothesis. But if the value of σ is unknown, this hypothesis
is not simple because the distribution of each Xᵢ is then not completely specified; it
could be normal with μ = 100 and σ = 15 or normal with μ = 100 and σ = 12 or
9.5 Some Comments on Selecting a Test Procedure 469