50 4 Empirical Orthogonal Functions
but if a viable interpretation has to be found, it must rely on a robust determination
of the patterns themselves. The SVD is a robust and reliable algorithm, so
we are not really concerned with mathematical and/or numerical sensitivities, but
with sensitivities deriving from the other possible choices that we can make in the
definition of the problem itself. A simple mathematical uncertainty is, for instance,
that eigenvectors are computed up to a change in sign, as can be derived from (4.2).
Moreover, data can be normalized in different ways, an operation that is often done to
stress one aspect or another of the data. One may also want to consider a different
geographical domain, or to analyze only a certain area for economy of calculation and
space. In the following we will discuss how the EOF react to these kinds of changes.
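The sign indeterminacy mentioned above can be checked numerically. The following is a minimal sketch on synthetic data, assuming that stations are stored along the rows of the data matrix and times along the columns, so that the spatial patterns are the left singular vectors. The helper `fix_signs` is hypothetical and implements just one possible convention (largest-magnitude entry of each pattern made positive); it is not a convention taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 200))      # synthetic data: 5 stations (rows), 200 times (columns)
X -= X.mean(axis=1, keepdims=True)     # remove the time mean of each station

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Flip the sign of the first pattern and of its time coefficients together.
U_flipped, Vt_flipped = U.copy(), Vt.copy()
U_flipped[:, 0] *= -1
Vt_flipped[0, :] *= -1

# The reconstructed data are identical: the sign of a mode is arbitrary.
same = np.allclose((U * s) @ Vt, (U_flipped * s) @ Vt_flipped)

def fix_signs(U, Vt):
    """Hypothetical convention: largest-magnitude entry of each pattern is positive."""
    idx = np.argmax(np.abs(U), axis=0)
    signs = np.sign(U[idx, np.arange(U.shape[1])])
    signs[signs == 0] = 1.0
    return U * signs, Vt * signs[:, None]

U_a, Vt_a = fix_signs(U, Vt)
U_b, Vt_b = fix_signs(U_flipped, Vt_flipped)
```

After applying the same convention to both decompositions, `U_a` and `U_b` coincide, which makes patterns from different computations directly comparable.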
4.4.1 Normalizing the Data
Data can be normalized in several ways. As we have seen when discussing the
correlation matrix, the most common normalization is the division by the standard
deviation. For the considered multivariate set this implies dividing by a standard
deviation that is different for each station. This approach allows us to compare time
series for stations that have large differences in the amplitude of the variability and
to focus on the temporal coherence among the station time series. As in previous
sections, in the following we assume that the vector mean has been removed from
the data matrix, so that we can assume that the sample mean is 0. The normalization
of the data can be obtained by dividing each column of the data matrix X in (4.1) by
the standard deviation of each station, $\sigma_1, \sigma_2, \ldots, \sigma_m$, that is

$$Y = D_X^{-1} X, \quad \text{with} \quad D_X = \operatorname{diag}(\sigma_1, \ldots, \sigma_m). \tag{4.5}$$
We can then proceed to compute EOF on the normalized data matrix Y. As we
already observed, the covariance matrix of Y is the correlation matrix of the original
data X. However, by normalizing the original matrix, we can compute the EOF
directly from the SVD of Y, without first computing the correlation matrix.
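The normalization in (4.5) and the equivalence with the correlation matrix can be sketched as follows. This is an illustration on synthetic data, not the test data set of the text; it assumes that X holds stations along the rows and times along the columns (consistent with multiplying by the inverse of D_X from the left) and that the sample standard deviation uses the n - 1 normalization. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 500                                      # 4 stations, 500 times (synthetic)
scales = np.array([0.5, 1.0, 5.0, 20.0])[:, None]  # very different amplitudes per station
X = scales * rng.standard_normal((m, n))
X -= X.mean(axis=1, keepdims=True)                 # remove the mean of each station

sigma = X.std(axis=1, ddof=1)                      # standard deviation of each station
D_inv = np.diag(1.0 / sigma)                       # inverse of D_X = diag(sigma_1, ..., sigma_m)
Y = D_inv @ X                                      # normalized data matrix

cov_Y = (Y @ Y.T) / (n - 1)                        # covariance matrix of Y
corr_X = np.corrcoef(X)                            # correlation matrix of X (rows are variables)

# Correlation EOF directly from the SVD of Y, without forming corr_X.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
explained = s**2 / np.sum(s**2)                    # fraction of variance per mode

# For comparison only: eigenvalues of the correlation matrix, in descending order.
evals = np.linalg.eigvalsh(corr_X)[::-1]
```

Here `cov_Y` agrees with `corr_X`, and the squared singular values of Y divided by n - 1 agree with the eigenvalues of the correlation matrix, so the SVD route gives the same modes without building the correlation matrix explicitly.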
The EOF of Y are sometimes called correlation EOF, as opposed to the covariance
EOF of the unnormalized data that we have seen in the last section. The main
difference between the two approaches is that the covariance EOF are going to
be biased toward the regions of highest standard deviation, so the patterns will try
to optimize as much as possible the variation of the field in those regions. On the
contrary, in the correlation EOF the normalization equalizes the field variations, so
the time series at every station are considered equally important. As a result, the
patterns will try to describe as much as possible the overall spatial variation of
the field. The standard deviation has a spatial structure (Fig. 4.7) and the effect of
normalization is to reduce the amplitude of the variations in the North Pacific and
North Atlantic, whereas the amplitude is expanded in the other regions.
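The bias of the covariance EOF toward high-variance stations can be illustrated with a small synthetic experiment. This is only a sketch under stated assumptions: the data are artificial (five stations sharing a common signal, plus one unrelated station with a much larger amplitude), stations are stored along the rows, and the helper `leading_eof` is hypothetical. It does not reproduce the figures discussed in the text.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 6, 2000
common = rng.standard_normal(n)

X = np.empty((m, n))
for i in range(5):
    # Stations 0-4: a shared signal plus modest independent noise.
    X[i] = common + 0.5 * rng.standard_normal(n)
# Station 5: unrelated to the others, but with a much larger amplitude.
X[5] = 10.0 * rng.standard_normal(n)
X -= X.mean(axis=1, keepdims=True)

def leading_eof(A):
    """Return the first spatial pattern and its fraction of explained variance."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, 0], s[0]**2 / np.sum(s**2)

# Normalize each station by its own standard deviation.
Y = X / X.std(axis=1, ddof=1, keepdims=True)

eof_cov, frac_cov = leading_eof(X)   # covariance EOF (unnormalized data)
eof_cor, frac_cor = leading_eof(Y)   # correlation EOF (normalized data)
```

In this setup the first covariance EOF is concentrated almost entirely on the high-amplitude station, while the first correlation EOF spreads its weight over the five coherent stations and gives the unrelated station a small loading.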
We show in Figs. 4.8 and 4.9 what happens when computing covariance and
correlation EOF on our test data set. The main comment is that the first mode is weakly