consider the estimated standard deviation for the error in predicting the final grade
of a single student with the predictor values 25, 28, 26, and 90. This is
$$\sqrt{s^2 + s_{\hat{Y}}^2} = \sqrt{9.897^2 + 1.882^2} = 10.074$$
Therefore, a 95% prediction interval for the final grade of a single student with
predictor scores 25, 28, 26, and 90 is
$$\hat{\mu}_{Y \cdot 25,28,26,90} \pm t_{.025,\,80-5}\,(10.074) = 85.55 \pm 1.992(10.074) = 85.55 \pm 20.07 = (65.5,\ 105.6)$$
Of course, this PI is much wider than the corresponding CI. Although we are highly
confident that the expected score is a B, the score for a single student could be as
low as a D or as high as an A. Notice that the upper end of the interval exceeds
the maximum score of 100, so it would be appropriate to truncate the interval to
(65.5, 100) ■
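The arithmetic behind this prediction interval can be checked with a short Python sketch. The function name `prediction_interval` and its parameters are illustrative choices, not something defined in the text. The critical value 1.992 is copied from the calculation above rather than computed, and the optional `upper` bound mirrors the truncation at the maximum score of 100.

```python
import math

def prediction_interval(y_hat, s, se_mean, t_crit, lower=None, upper=None):
    """Return (se_pred, lo, hi) for the prediction of a single new observation.

    y_hat   : point prediction (estimated mean response)
    s       : estimated standard deviation of the random error
    se_mean : estimated standard deviation of the estimated mean response
    t_crit  : t critical value for the desired confidence level
    lower, upper : optional bounds used to truncate the interval
    """
    # Standard deviation of the prediction error combines both sources of variation.
    se_pred = math.sqrt(s ** 2 + se_mean ** 2)
    half_width = t_crit * se_pred
    lo, hi = y_hat - half_width, y_hat + half_width
    if lower is not None:
        lo = max(lo, lower)
    if upper is not None:
        hi = min(hi, upper)
    return se_pred, lo, hi

# Values taken from the example: 85.55, 9.897, 1.882, and t = 1.992.
se_pred, lo, hi = prediction_interval(85.55, 9.897, 1.882, 1.992)
print(round(se_pred, 3), round(lo, 1), round(hi, 1))

# Same interval, truncated at the maximum possible score of 100.
_, lo_t, hi_t = prediction_interval(85.55, 9.897, 1.882, 1.992, upper=100.0)
print(round(lo_t, 1), hi_t)
```

Because $s$ is so much larger than $s_{\hat{Y}}$, almost all of the interval's width comes from the variability of an individual score rather than from uncertainty about the mean.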
Frequently, the hypothesis of interest has the form $H_0\colon \beta_i = 0$ for a particular
$i$. For example, after fitting the four-predictor model in Example 12.25, the
investigator might wish to test $H_0\colon \beta_2 = 0$. According to $H_0$, as long as the
predictors $x_1$, $x_3$, and $x_4$ remain in the model, $x_2$ contains no useful information
about $y$. The test statistic value is the t-ratio $\hat{\beta}_i / s_{\hat{\beta}_i}$. Many statistical computer
packages report the t-ratio and corresponding P-value for each predictor included
in the model. For example, Figure 12.29 shows that as long as algebra pretest score,
ACT natural science, and high school percentile rank are retained in the model, the
predictor $x_2$ = ACT math score can be deleted. The P-value for $x_2$ is .55, much too
large to reject the null hypothesis.
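As a rough illustration of the t-ratio test, the following Python sketch computes the ratio and compares it with a two-sided critical value. The coefficient estimate and standard error used here are hypothetical numbers chosen only for illustration; they are not the values in Figure 12.29. The critical value 1.992 is the $t_{.025}$ value for 75 degrees of freedom used in the prediction interval earlier in this example.

```python
def t_ratio_test(beta_hat, se_beta, t_crit):
    """Two-sided test of H0: beta_i = 0 using the t-ratio.

    beta_hat : estimated coefficient
    se_beta  : estimated standard deviation of the coefficient estimate
    t_crit   : two-sided t critical value for the chosen significance level
    Returns the t-ratio and whether H0 is rejected.
    """
    t = beta_hat / se_beta
    return t, abs(t) >= t_crit

# HYPOTHETICAL estimate and standard error (not taken from the text).
t, reject = t_ratio_test(beta_hat=0.12, se_beta=0.20, t_crit=1.992)
print(round(t, 2), reject)
```

A t-ratio this close to zero falls well inside the critical values, which is the same situation as a large P-value: the predictor adds little once the others are in the model.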
It is interesting to look at the correlations between the predictors and the
response variable in Example 12.25. Here are the correlations and the
corresponding P-values (in parentheses):
              alg plc    ACTmath    ACTns      rank
calc grade    0.491      0.353      0.259      0.324
              (0.000)    (0.0013)   (0.020)    (0.003)
Do these values seem inconsistent with the multiple regression results? There is a
highly significant correlation between calculus grade and ACT math score, but in
the multiple regression the ACT math score is redundant, not needed in the model.
The idea is that ACT math score also has highly significant correlations with the
other predictors, so much of its predictive ability is retained in the model when this
variable is deleted. In order to be a statistically significant predictor in the multiple
regression model, a variable must provide additional predictive ability beyond what
is offered by the other predictors.
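One way to see how a predictor can correlate with the response and still be redundant is the first-order partial correlation, which measures the association left after one other predictor is accounted for. The sketch below is a simplification: it controls for only one other predictor instead of three, and the correlation between the two predictors (0.6) is a hypothetical value, since the text does not report it. The correlations 0.491 and 0.353 are the ones quoted above.

```python
import math

def partial_correlation(r_yx, r_yz, r_xz):
    """Correlation between y and x after removing the linear effect of z."""
    return (r_yx - r_yz * r_xz) / math.sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))

r_y_alg = 0.491   # calculus grade vs. algebra placement (from the text)
r_y_math = 0.353  # calculus grade vs. ACT math (from the text)
r_alg_math = 0.6  # HYPOTHETICAL correlation between the two predictors

partial_r = partial_correlation(r_y_math, r_y_alg, r_alg_math)
print(round(partial_r, 3))
```

Under this assumed inter-predictor correlation, most of the association between ACT math score and calculus grade disappears once algebra placement is taken into account, which is the pattern the multiple regression results show.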
The $R^2$ value for the calculus data is disappointing. Given the importance
placed on predictors such as ACT scores and high school rank in college admissions
and NCAA eligibility, we might expect that these scores would give better
predictions.