
4.5 Reconstruction of the Data 61
exactly one zero singular value when the mean is removed, because the subtraction
of the mean reduces the degrees of freedom of the data by one.$^{2}$
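The effect of removing the mean on the singular values can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming a synthetic random data matrix with m = 8 observation points and n = 5 samples; the sizes, the seed, and the tolerance `tol` are illustrative choices, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 5                      # assumed sizes: m observation points, n samples (m >= n)
X = rng.standard_normal((m, n))  # synthetic data, one sample per column

x_bar = X.mean(axis=1, keepdims=True)  # sample mean at each observation point
Xc = X - x_bar                         # data with the mean removed

s_raw = np.linalg.svd(X, compute_uv=False)        # singular values before centering
s_centered = np.linalg.svd(Xc, compute_uv=False)  # singular values after centering

# Count singular values that are zero up to round-off (tolerance is an assumption).
tol = 1e-10 * s_centered[0]
n_zero_raw = int(np.sum(s_raw < tol))
n_zero_centered = int(np.sum(s_centered < tol))

print("zero singular values before centering:", n_zero_raw)
print("zero singular values after centering: ", n_zero_centered)
```

For a generic random matrix of this kind, all singular values are nonzero before the mean is removed, and one of them drops to round-off level afterwards.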
The conclusion that we need to retain all modes may in fact be premature. There
is another issue that we need to investigate. We have computed the EOF on the
available data sample, but it is not at all clear how representative of the true EOF
they are. In practice, we have to realize that our computation is only an estimate of
the EOF of the population from which we have extracted a sample, and we need to
investigate how accurate this estimate is. In particular, we need to evaluate the
probability that some of the singular values are zero just because of the choice of the
members of the sample. Some singular values can really be zero and correspond to degrees
of freedom that do not contribute to the variability of the field, but others may appear
nonzero just because of our particular sampling. An EOF analysis is therefore incomplete
without some consideration of the robustness of the results and their sensitivity to
changes in the sampling or in other aspects of the procedure.
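One simple way to probe this sensitivity is to recompute the singular values on many random subsamples of the available data and look at their spread. The sketch below is only an illustration of that idea, not a procedure taken from the text: the function name `singular_value_spread`, the subsampling fraction, the number of trials, and the synthetic test field are all assumptions.

```python
import numpy as np

def singular_value_spread(X, n_trials=200, frac=0.5, seed=0):
    """Mean and standard deviation of the singular values over random
    subsamples (without replacement) of the columns of X."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    n_sub = int(frac * n)
    sv = np.empty((n_trials, min(m, n_sub)))
    for trial in range(n_trials):
        cols = rng.choice(n, size=n_sub, replace=False)
        Xs = X[:, cols]
        Xs = Xs - Xs.mean(axis=1, keepdims=True)  # remove the subsample mean
        # Scaling by sqrt(n_sub) keeps values comparable across sample sizes.
        sv[trial] = np.linalg.svd(Xs, compute_uv=False) / np.sqrt(n_sub)
    return sv.mean(axis=0), sv.std(axis=0)

# Synthetic field (assumed): one coherent pattern plus uncorrelated noise.
rng = np.random.default_rng(1)
m, n = 20, 200
t = np.arange(n)
pattern = np.sin(np.linspace(0.0, np.pi, m))
X = 3.0 * np.outer(pattern, np.sin(2.0 * np.pi * t / 25.0))
X = X + rng.standard_normal((m, n))

mean_sv, std_sv = singular_value_spread(X)
for i in range(4):
    print(f"mode {i + 1}: {mean_sv[i]:.3f} +/- {std_sv[i]:.3f}")
```

A singular value whose spread across subsamples is comparable to its distance from its neighbours, or from zero, should be interpreted with caution.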
The EOF have identified some patterns corresponding to observation points that
vary together in an organized manner, but each observation point may also have variance
that is uncorrelated with the other points. From the point of view of the spatial analysis
of variance, this uncorrelated part is considered noise.
Mathematically, the EOF will tend to fit those components as well, thus generating a
fictitious pattern. This is one of the reasons why the higher-order EOF
have very complex patterns. They try to fit the variance point by point: a desperate
job, since it is mostly uncorrelated. This portion of the variance is not really interesting,
but we can exploit this property of the EOF to generate data that are free of the
noise component, simply by reconstructing the data set retaining only the leading
(highest-variance) modes, which correspond to the covarying patterns (cf. Sect. 4.5).
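The reconstruction idea can be sketched with a truncated singular value decomposition. This is a minimal illustration under stated assumptions: the helper `reconstruct`, the synthetic signal, the noise level, and the choice of one retained mode are hypothetical and serve only to show the mechanics.

```python
import numpy as np

def reconstruct(X, k):
    """Rebuild X (points x samples) from its k leading modes.
    The mean is removed before the SVD and added back afterwards."""
    mean = X.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean + (U[:, :k] * s[:k]) @ Vt[:k]

# Synthetic test (assumed): one covarying pattern plus uncorrelated noise.
rng = np.random.default_rng(0)
m, n = 30, 200
t = np.arange(n)
pattern = np.sin(np.linspace(0.0, np.pi, m))
signal = 3.0 * np.outer(pattern, np.sin(2.0 * np.pi * t / 25.0))
X = signal + rng.standard_normal((m, n))

X_hat = reconstruct(X, k=1)  # keep only the leading mode

err_noisy = np.linalg.norm(X - signal)      # distance of raw data from the clean signal
err_recon = np.linalg.norm(X_hat - signal)  # distance of the reconstruction
print("error of noisy data:     ", err_noisy)
print("error of reconstruction: ", err_recon)
```

In this synthetic case the clean signal is known, so the two errors can be compared directly; with real data the number of modes to retain has to be decided on other grounds.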
Another example is shown in the following pictures. A two-dimensional wave is
propagating in a square domain from left to right. The wave is a fairly regular sine
wave, but a substantial amount of noise is superposed. At any time the wave pattern
is substantially distorted by the noise (Fig. 4.13). In Fig. 4.14 we display the EOF of
the time evolution, obtained by considering as observation points the local position
at which the wave is observed to pass.
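An experiment of this kind can be imitated with synthetic data. The sketch below is not the computation behind Figs. 4.13 and 4.14: the grid size, wavelength, period, and noise level are assumed values, and the wave is taken to be uniform in the direction across the propagation. It computes the EOF through an SVD, the fraction of variance explained by each mode, and the lagged correlation between the time coefficients of the first two modes.

```python
import numpy as np

rng = np.random.default_rng(0)
nx, ny, nt = 20, 20, 200          # assumed grid and number of time steps
wavelength, period = 10.0, 20.0   # assumed wave parameters (grid units, time steps)
noise_std = 0.5                   # assumed noise level (wave amplitude is 1)

x = np.arange(nx)
t = np.arange(nt)

# Sine wave travelling from left to right (+x), uniform in y.
phase = 2.0 * np.pi * (x[None, :, None] / wavelength - t[None, None, :] / period)
wave = np.broadcast_to(np.sin(phase), (ny, nx, nt))
field = wave + noise_std * rng.standard_normal((ny, nx, nt))

# Data matrix: one row per grid point, one column per time step.
X = field.reshape(ny * nx, nt)
X = X - X.mean(axis=1, keepdims=True)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
explained = s**2 / np.sum(s**2)   # fraction of variance per mode

# Time coefficients of the first two modes.
pc1, pc2 = Vt[0], Vt[1]
quarter = int(period / 4)
zero_lag = np.corrcoef(pc1, pc2)[0, 1]  # near zero: the modes are orthogonal
lag_corr = abs(np.corrcoef(pc1[:-quarter], pc2[quarter:])[0, 1])

print("variance explained by modes 1-3:", explained[:3])
print("correlation at zero lag:         ", zero_lag)
print("|correlation| at quarter period: ", lag_corr)
```

A correlation close to zero at zero lag together with a large correlation at a lag of a quarter period is the signature of two modes in quadrature.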
The EOF recover the coherent pattern of the propagating wave fairly quickly,
and the first two modes explain most of the total variance. We can also see how
the EOF represent propagation, usually with two modes that are in quadrature
and fairly similar in distribution. This indicates that those modes are two phases
of the same propagating pattern. The noise is relegated to the higher modes; since we
added a significant amount of noise, these modes are not insignificant. The totally
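$^{2}$ More precisely, using $\bar{x} = \frac{1}{n} X \mathbf{1}$, we have
$$X - \bar{x}\mathbf{1}^{\top} = X\left(I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^{\top}\right).$$
Since the matrix $I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^{\top}$ has rank $n-1$, the relation above shows that $(X - \bar{x}\mathbf{1}^{\top})$ has rank not
greater than $\min\{n-1, m\}$. Therefore, the scaled covariance matrix $(X - \bar{x}\mathbf{1}^{\top})(X - \bar{x}\mathbf{1}^{\top})^{\top}$ has
rank at most $n-1$, if $m \ge n$.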