
validation. Many forest modellers have adopted such an approach. It is interesting 
to note that the reported outcomes from such an approach always appear the same: 
the model is good on both the fitting and validation data sets. There is hardly any 
reporting of a failed model from such an approach, although in reality one tends to 
find that many models behave undesirably in application. 
What is wrong here? The ‘trick’ is the ‘data-splitting scheme’. Because the split 
data sets are not independent of each other, the data-splitting scheme used in model 
validation is not validating a fitted model; instead, it validates the sampling tech-
nique used to partition the data. Since there is no standard procedure on how the 
two portions of the data should be partitioned, various alternatives can be used: 
1.  The data can be split by 50–50%, 75–25%, 80–20% or any other proportion as the 
modeller sees fit. 
2.  Various sampling techniques can also be used to derive two representative sam-
ple portions. 
3.  Assuming that the data were to be split 50–50% randomly, the modeller could 
repeatedly split the data in endless ways to derive the ‘correct’ 50–50% split. There 
is a strong possibility that, in practice, this kind of approach can be easily misused 
due to its lack of consistency and repeatability. In fact, it may also be ‘manipulated’, 
as one can keep sampling until the ‘right’ sample comes up (see the sketch following 
this list). 
4.  The sampling proportion is large. Even for an 80–20% split, 20% of the popula-
tion is sampled for model validation. This is a huge percentage considering that 
most of the forest inventories in Alberta and elsewhere sample much less than 1% of 
the population. Sampling 50% of the population in a 50–50% split is halfway to 
‘census’ instead of ‘sampling’. Either portion then mirrors the population closely, 
creating the best possible illusion of successful model validation. 
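To make point 3 concrete, the following rough sketch (in Python, using hypothetical stand data and a simple linear height–diameter relationship that are not part of this chapter) shows how repeatedly re-splitting the same data set produces a whole range of ‘validation’ statistics, from which a favourable one can simply be picked.

```python
# Illustrative sketch only: repeated random splits of the same fitting data
# yield a spread of 'validation' statistics, so a modeller could keep drawing
# splits until a favourable one appears. The linear height-diameter model and
# the variable names are hypothetical assumptions, not the chapter's data.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand data: diameter at breast height (cm) and height (m).
n = 200
dbh = rng.uniform(5.0, 50.0, n)
height = 1.3 + 0.8 * dbh + rng.normal(0.0, 3.0, n)

def split_and_validate(split=0.5):
    """Fit on one random portion, report RMSE on the reserved portion."""
    idx = rng.permutation(n)
    cut = int(n * split)
    fit_i, val_i = idx[:cut], idx[cut:]
    coef = np.polyfit(dbh[fit_i], height[fit_i], 1)   # least-squares fit
    pred = np.polyval(coef, dbh[val_i])               # predict reserved data
    return np.sqrt(np.mean((height[val_i] - pred) ** 2))

# Same data, same model -- yet every re-split gives a different 'validation'
# RMSE, because the reserved portion is never independent of the whole data
# set from which it was drawn.
rmses = [split_and_validate(0.5) for _ in range(1000)]
print(f"validation RMSE ranges from {min(rmses):.2f} to {max(rmses):.2f}")
```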
In addition to data splitting, other procedures might also be used to evaluate 
the goodness of model prediction. These include: the conditional mean squared 
error of prediction (Cp), the PRESS statistic, Amemiya’s statistic, various resampling 
methods with the funny names of ‘bootstrap’ and ‘jackknife’, and Monte Carlo sim-
ulations (Judge et al., 1988; Davidson and MacKinnon, 1993). All these procedures are 
correct in their own right. For instance, the Cp and Amemiya’s statistic are similar to 
other goodness of fit measures. The PRESS statistic is similar to data splitting. The 
resampling methods are used when appropriate sampling results are not available 
and one needs a non-parametric method of estimating measures of precision (Judge 
et al., 1988). By resampling the estimated errors after a model has been fitted 
to the data, ‘pseudo sample data’ are generated to emulate the modelling data, 
permitting the model to be refitted. Monte Carlo simulations involving the pseudo 
sample data are used to approximate the unobservable sampling distributions and 
provide information on simulated sampling variability, confidence intervals, biases, 
etc. All these procedures can provide some informative statistics and can be of use 
for looking at a model from different angles, but their utility in model validation is 
quite dubious and not clearly understood, for they are heavily dependent on the 
model-fitting data. This dependence is not consistent with the prerequisite of vali-
dating a model on independent data set(s). 
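The following sketch illustrates, in rough terms, the residual-resampling idea described above: a simple ‘bootstrap’ of the estimated errors followed by Monte Carlo re-fitting. The linear model, the variable names and the number of replicates are hypothetical assumptions rather than anything prescribed here; the closing comment underlines why such pseudo samples remain tied to the model-fitting data.

```python
# Illustrative sketch only: a residual ('pseudo data') bootstrap with Monte
# Carlo re-fitting. The model and variable names are hypothetical examples.
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical model-fitting data.
n = 150
dbh = rng.uniform(5.0, 50.0, n)
height = 1.3 + 0.8 * dbh + rng.normal(0.0, 3.0, n)

# Fit the model once to the real data and keep its estimated errors.
coef = np.polyfit(dbh, height, 1)
fitted = np.polyval(coef, dbh)
residuals = height - fitted

# Monte Carlo loop: resample the estimated errors, build pseudo sample data,
# and re-fit the model to approximate the sampling distribution of the slope.
slopes = []
for _ in range(2000):
    pseudo_height = fitted + rng.choice(residuals, size=n, replace=True)
    slopes.append(np.polyfit(dbh, pseudo_height, 1)[0])

slopes = np.asarray(slopes)
print(f"bootstrap slope: mean {slopes.mean():.3f}, "
      f"approx. 95% interval [{np.percentile(slopes, 2.5):.3f}, "
      f"{np.percentile(slopes, 97.5):.3f}]")
# Every pseudo sample is built from the model-fitting data itself, which is
# why such replicates cannot stand in for independent validation data.
```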
While recognizing that data splitting is useful for other purposes (Picard and 
Berk, 1990), it was felt that because of the variations and the potential subjectivity 
related to data splitting (re-creation), the practice of splitting the data into two parts 
should not be used further in validating forestry models, for the reserved data are 
not independent of the modelling data and there are numerous ways in which the 
data can be chosen to substantiate a modeller’s own objectives and, sometimes, bias. 
In some ways, the fact that there is hardly any reporting of a failed model from this