The coefficient of determination, $r^2$, shows how much of the variation is
explained by the regression. A plot of the residuals will help indicate if the
regression model is an appropriate analysis for the data (Chapter 16).
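As a minimal sketch of how the coefficient of determination can be computed for a straight-line fit, the following uses the standard identity $r^2 = 1 - SS_{residual}/SS_{total}$. The function name `r_squared` and the small data set are hypothetical and chosen only for illustration.

```python
def r_squared(x, y):
    """Proportion of the variation in y explained by the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Least-squares slope and intercept
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    # Variation left over after the fit, and total variation about the mean
    ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_residual / ss_total


# Hypothetical data: a weak upward trend with a lot of scatter
x = [0, 1, 2, 3]
y = [0, 1, 0, 1]
print(r_squared(x, y))
```

A value close to 1 means the line accounts for most of the variation; a value close to 0 means it accounts for very little.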
The residuals $(y_i - \hat{y})$ have another use in sequence analysis. By plotting
them against the independent variable X, the regression line of best fit $(\hat{y})$ is
converted to a horizontal line where all values of $\hat{y}$ are equal to zero, with
the residuals dispersed about it. This effectively removes the variation
explained by the line of best fit from the sequence, and has two advantages.
First, if the regression line is a good fit to the data, the plot of the residuals
will now be independent of any long-term trend. This is one way of
detrending a sequence, and the detrended data can be used to investigate
autocorrelation caused by repetition within the sequence without the
confounding effect of any general trend upon the value of r.
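The detrending step described above can be sketched as follows: fit a least-squares line and keep only the residuals. The helper names `fit_line` and `detrend`, and the sequence itself (a linear trend plus a pattern that repeats every four samples), are assumptions made for illustration.

```python
def fit_line(x, y):
    """Return intercept a and slope b of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    return a, b


def detrend(x, y):
    """Return the residuals y_i - y_hat, i.e. the sequence with the linear trend removed."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical sequence: linear trend plus a repeating pattern of period 4
x = list(range(12))
cycle = [1.0, 0.0, -1.0, 0.0]
y = [2.0 + 0.5 * xi + cycle[xi % 4] for xi in x]

residuals = detrend(x, y)
for xi, res in zip(x, residuals):
    print(xi, round(res, 3))
```

The residuals scatter about zero with no overall slope, so any repetition that remains in them is not masked by the trend.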
Second, if the residuals are not evenly distributed about zero it suggests
there is still variation present that is not accounted for by the line of best fit.
The pattern of the residuals about the line may indicate the characteristics of
this variation, from which you could make a choice of additional terms to
incorporate into the regression equation in an attempt to improve the fit of
the model.
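To illustrate how a pattern in the residuals can suggest an extra term, the sketch below fits a straight line and then a quadratic to a hypothetical curved sequence. The small least-squares solver `polyfit` is written out here only so the example is self-contained, and the quadratic is just one possible choice of additional term; in practice a statistical package would normally do this fitting.

```python
def polyfit(x, y, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] from the normal equations."""
    m = degree + 1
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(xi ** (i + j) for xi in x) for j in range(m)]
         + [sum(yi * xi ** i for xi, yi in zip(x, y))]
         for i in range(m)]
    # Gaussian elimination with partial pivoting
    for col in range(m):
        pivot = max(range(col, m), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, m):
            factor = A[row][col] / A[col][col]
            for c in range(col, m + 1):
                A[row][c] -= factor * A[col][c]
    # Back substitution
    coef = [0.0] * m
    for i in range(m - 1, -1, -1):
        known = sum(A[i][j] * coef[j] for j in range(i + 1, m))
        coef[i] = (A[i][m] - known) / A[i][i]
    return coef


def residuals(x, y, coef):
    """Observed minus fitted values for a polynomial with the given coefficients."""
    return [yi - sum(c * xi ** k for k, c in enumerate(coef)) for xi, yi in zip(x, y)]


# Hypothetical curved sequence
x = list(range(10))
y = [1.0 + 0.5 * xi + 0.2 * xi ** 2 for xi in x]

linear_res = residuals(x, y, polyfit(x, y, 1))
quadratic_res = residuals(x, y, polyfit(x, y, 2))

print([round(r, 2) for r in linear_res])      # look for a systematic pattern about zero
print([round(r, 2) for r in quadratic_res])
```

With the straight line, the residuals are positive at both ends and negative in the middle, which is the kind of uneven distribution about zero that points to a missing curved term. Adding the squared term removes that pattern for this data set.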
A further step often used in sequence analysis is to draw a correlogram of
the residuals. If the regression is a good description of the data, the values of
r at lags of one or more in this correlogram should only show random
fluctuation around a mean of zero. Significant values of r will indicate any
remaining autocorrelation.
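A correlogram of residuals can be sketched as below. Two assumptions are made here that the text does not specify: r at each lag is estimated with the common sample autocorrelation formula (products of deviations from the mean, divided by the total sum of squares), and values are flagged using the rough guide of plus or minus $2/\sqrt{n}$ rather than a formal significance test. The residuals are hypothetical, with a pattern repeating every four samples.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of a sequence at the given lag."""
    n = len(values)
    mean = sum(values) / n
    total = sum((v - mean) ** 2 for v in values)
    shared = sum((values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag))
    return shared / total


def correlogram(values, max_lag):
    """Autocorrelation at lags 0, 1, ..., max_lag."""
    return [autocorrelation(values, lag) for lag in range(max_lag + 1)]


# Hypothetical residuals that still contain a repeating pattern of period 4
residuals = [1.0, 0.0, -1.0, 0.0] * 5

r = correlogram(residuals, 8)
bound = 2 / math.sqrt(len(residuals))  # rough guide only, not a formal test
flagged = [lag for lag in range(1, 9) if abs(r[lag]) > bound]

for lag, value in enumerate(r):
    print(lag, round(value, 2))
print("lags outside the rough bound:", flagged)
```

If the regression had described the sequence well, no lag of one or more would stand out; here the even lags do, which indicates repetition that the fitted line has not accounted for.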
Choosing an appropriate model requires a good understanding of complex
regression. Statistical packages can do extremely complex autoregression
analyses, but these models have quite stringent assumptions and are
often misapplied and misinterpreted. Therefore, if the sequence appears
complex it is important to seek expert advice. Here we give straightforward
examples of the use of some regression models.
21.7 Simple linear regression
For a sequence that shows an apparently linear trend over time, as in
Figure 21.1, the correlogram should be similar to Figure 21.3 (c) or (d),
but you might not even draw one for such an obvious relationship. The
sequence could be analyzed using simple linear regression: