126 8 Multiple Linear Regression Methods
in the coupling, and it will give an indication of the strength of the relation between
one field and the other. The method based on the matrices A and B obtained via
the Least Squares method will be denoted in the following as the PRO method.
The name recalls the PROcrustes problem, the term commonly employed for this
formulation in the climatology community, although the true Procrustes problem
requires additional constraints; it is discussed in Sect. 8.2.2.
Note that the method based on the pseudoinverse can be applied to any pair of
fields. We have also made no assumptions regarding the geographical location of
the data used in the analysis. The fields S and Z could be located in the same
geographical domain, or they could be in remote locations distant from each
other. We might consider, for instance, the geopotential and the SST in the
same domain in the tropics, or we might take the tropical SST and the geopotential
over North America. In the former case we are looking at a local relation between
the fields; in the latter case we are really looking at remote influences, probably
mediated by other physical processes.
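The following is a minimal sketch of the pseudoinverse-based estimate, not the authors' implementation. It assumes the fields are stored with grid points along the rows and time levels along the columns (Z is p x n, S is q x n) and that the relation sought is Z ≈ A S. The synthetic data, the sizes, and the helper name `pro_operator` are illustrative assumptions.

```python
import numpy as np

def pro_operator(Z, S):
    """Least-squares estimate of A in Z ~ A S, using the pseudoinverse of S."""
    return Z @ np.linalg.pinv(S)

# Synthetic example: the sizes and the noise level are arbitrary choices.
rng = np.random.default_rng(0)
p, q, n = 6, 4, 50                      # grid points of Z, grid points of S, time levels
S = rng.standard_normal((q, n))
A_true = rng.standard_normal((p, q))    # hypothetical "true" coupling
Z = A_true @ S + 0.01 * rng.standard_normal((p, n))

A = pro_operator(Z, S)
print(A.shape)                          # one row per grid point of Z, one column per grid point of S
```

Nothing in `pro_operator` refers to where the grid points lie, so the same call serves for fields in the same domain or in remote ones.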
8.2 A Practical PRO Method
The cost of the calculation described in the preceding section depends strongly on
the order of the data matrices, that is, on the row and column dimensions of
Z and S. The cross-correlation matrix $Z S^T$, which is the essential part of A, may
have very large dimensions if many grid points are considered. In our case, its
dimensions are $p \times q$. In some applications this is not a problem, but in a
typical climate or meteorological application the number of grid points can quickly
run into the thousands, making the calculation of $Z S^T$ impractical.
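To give a sense of scale, the sketch below estimates the storage needed for a dense p x q matrix such as the cross-correlation matrix, assuming double precision (8 bytes per entry). The grid sizes are illustrative assumptions, not values from the text, and `dense_matrix_gib` is a hypothetical helper.

```python
def dense_matrix_gib(rows, cols, bytes_per_entry=8):
    """Storage of a dense rows x cols matrix, in GiB (2**30 bytes)."""
    return rows * cols * bytes_per_entry / 2**30

# Hypothetical grid sizes (p, q) for the two fields.
for p, q in [(1_000, 1_000), (10_000, 10_000), (100_000, 100_000)]:
    print(f"p = {p:>7}, q = {q:>7}: {dense_matrix_gib(p, q):10.3f} GiB")
```

Under these assumptions, p = q = 10,000 already requires about 0.75 GiB for the matrix alone, and the requirement grows with the product of the two grid sizes.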
A significant simplification of the calculation can be achieved by exploiting the
data compression properties of the EOFs introduced in previous chapters. The
maximum number of EOFs for a data field, say Z, is given by the smaller dimension
of Z. In typical meteorological applications the number of time levels is often much
smaller than the number of grid points, so working with the EOF coefficients
reduces the size of the problem considerably.
If the columns of $U_Z$, $U_S$ contain the EOFs of the two fields Z, S, respectively,
that is, their left singular vectors, then we know from Chap. 4 that
$$Z = U_Z \tilde{Z}, \qquad S = U_S \tilde{S}.$$
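A minimal NumPy sketch of this factorization follows. It assumes the EOFs are taken from a thin SVD, X = U Σ V^T, so that the EOF coefficients are Σ V^T (equivalently U^T X). The function name `eof_decompose`, the sizes, and the random data are illustrative assumptions.

```python
import numpy as np

def eof_decompose(X):
    """Return (U, X_tilde) such that X = U @ X_tilde.

    The columns of U are the left singular vectors (EOFs) of X and
    X_tilde holds the corresponding EOF coefficients.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_tilde = s[:, None] * Vt          # scale each row of V^T by its singular value
    return U, X_tilde

rng = np.random.default_rng(1)
p, n = 500, 30                         # many grid points, few time levels (hypothetical)
Z = rng.standard_normal((p, n))

U_Z, Z_tilde = eof_decompose(Z)
print(Z.shape, U_Z.shape, Z_tilde.shape)
```

With more grid points than time levels, the coefficient matrix has only as many rows as there are time levels, which is the source of the reduction in problem size.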
We can then define the significantly smaller problem in terms of the EOF
coefficients as
$$\min_{A} \left\| \tilde{Z} - A \tilde{S} \right\|_F .$$
Its solution can be found in a similar way in terms of the tilde quantities as
$$A = \tilde{Z} \tilde{S}^T \left( \tilde{S} \tilde{S}^T \right)^{-1}. \tag{8.7}$$
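The reduced solution (8.7) can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the EOF coefficients come from a thin SVD with all modes retained, the data are synthetic, and the names `eof_decompose`, `reduced_pro`, and `A_red` are hypothetical. A linear solve replaces the explicit inverse, which is a numerical choice made here and not something prescribed by the text.

```python
import numpy as np

def eof_decompose(X):
    """Return (U, X_tilde) with X = U @ X_tilde from a thin SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U, s[:, None] * Vt

def reduced_pro(Z, S):
    """Solve the reduced problem (8.7) in EOF-coefficient space."""
    _, Z_t = eof_decompose(Z)
    _, S_t = eof_decompose(S)
    G = S_t @ S_t.T                    # small symmetric matrix S~ S~^T
    # A = Z~ S~^T G^{-1}; since G is symmetric, A^T solves G A^T = S~ Z~^T.
    A_red = np.linalg.solve(G, S_t @ Z_t.T).T
    return A_red, Z_t, S_t

rng = np.random.default_rng(2)
p, q, n = 300, 400, 25                 # hypothetical grid sizes and time levels
Z = rng.standard_normal((p, n))
S = rng.standard_normal((q, n))

A_red, Z_t, S_t = reduced_pro(Z, S)
print("reduced operator:", A_red.shape, " full operator would be:", (p, q))

# Cross-check against a generic least-squares solver on the same reduced problem.
A_lstsq = np.linalg.lstsq(S_t.T, Z_t.T, rcond=None)[0].T
print("agrees with lstsq:", np.allclose(A_red, A_lstsq))
```

In this example the matrix to be determined has at most n rows and n columns, instead of the p x q entries of the grid-space problem.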