Exploratory Analysis of Geochemical Anomalies 79
input variables. Thus, the first factor accounts for the highest proportion of the ‘total’
common variance in the input multivariate data, whereas the k-th (or last) factor accounts
for the least proportion of the ‘total’ common variance in the input multivariate data.
Because the ‘total’ common variance in n input multivariate data is unknown, the
‘optimum’ k common factors must be determined by following a number of statistical
tests (Basilevsky, 1994) or ‘rule-of-thumb’ criteria, e.g., factors that cumulatively
account for at least 70% of the total variance (Reimann et al., 2002). From the foregoing
discussion, the following can be said about the applicability of either PCA or FA in
geochemical data analysis (cf. Howarth and Sinding-Larsen, 1983). On the one hand,
PCA is favourable in cases of geochemical data analysis in which the range of PCs
representing the ‘most common’ variance to the ‘most specific’ variance in the input
multi-element data sets is of interest to allow recognition of latent inter-element
variations that reflect the various geochemical processes in a study area. On the other
hand, FA is favourable in cases of geochemical data analysis in which the factors
representing the ‘most common’ variance in the input multi-element data sets are of
interest to allow recognition of latent inter-element relationships that describe the
different geochemical processes in a study area.
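As an illustration of the ‘rule-of-thumb’ criterion mentioned above (retain the factors that cumulatively account for at least 70% of the total variance), the following minimal Python sketch picks the smallest such k from a list of per-factor variances. The function name, the default threshold argument and the example variances are hypothetical; the values are not taken from any table in this chapter, and the sketch assumes the variance explained by each factor has already been estimated.

```python
def select_k_by_cumulative_variance(variances, threshold=0.70):
    """Return (k, cumulative), where k is the smallest number of leading
    factors whose cumulative share of the total variance reaches `threshold`."""
    if not variances:
        raise ValueError("variances must be non-empty")
    # Order factors from the largest to the smallest explained variance.
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    cumulative = []
    running = 0.0
    for value in ordered:
        running += value
        cumulative.append(running / total)
    # First position at which the cumulative proportion meets the threshold.
    for k, proportion in enumerate(cumulative, start=1):
        if proportion >= threshold:
            return k, cumulative
    return len(ordered), cumulative


# Hypothetical variances for six factors (illustrative only).
example_variances = [3.0, 1.5, 0.75, 0.5, 0.15, 0.10]
k, cumulative = select_k_by_cumulative_variance(example_variances)
```

With these illustrative values the first factor alone explains 50% of the total variance and the first two explain 75%, so the criterion retains two factors. A different threshold, or one of the formal statistical tests cited above, could lead to a different k.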
Given this difference between PCA and FA, PCA is considered more appropriate than FA
for the case study because of its ‘exploratory’ rather than ‘confirmatory’ nature. In PCA,
it is essential to
use standardised data if the correlation matrix is used to derive the PCs or to use
unstandardised data if the covariance matrix is used to derive the PCs (Trochimczyk and
Chayes, 1978). In addition, because estimates of either the correlation coefficient or the
covariance are influenced by data form, presence of censored values, outliers and more
than one population, it is also essential to ‘clean’ and transform the data so that they
approach a (nearly) symmetrical distribution. For the same log_e-transformed uni-element
data sets that were used to create the scatterplots in Fig. 3-18D, the correlation matrix in
Table 3-V and the covariance matrix in Table 3-VI, the derived PCs are shown in Table
3-VII. The correlation matrix (Table 3-V) was used to derive the PCs, so the uni-element
data sets were first standardised using equation (3.11).
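The procedure just described (log_e-transform, standardise, then derive PCs from the correlation matrix) can be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions, not the computation used to produce Table 3-VII: equation (3.11) is not reproduced here, so ordinary z-score standardisation is assumed in its place; the function name is hypothetical; and the input is synthetic lognormal data rather than the case-study data. Loadings are scaled here as eigenvector times the square root of the eigenvalue, which is one common convention and may differ from the one used in the table.

```python
import numpy as np


def pca_from_correlation(raw_values):
    """PCA of log_e-transformed, standardised data via the correlation matrix.

    raw_values: (n_samples, n_elements) array of strictly positive values.
    Returns eigenvalues (descending), loadings (one column per PC),
    PC scores, and the proportion of total variance explained by each PC.
    """
    logged = np.log(np.asarray(raw_values, dtype=float))
    # Assumed z-score standardisation (a stand-in for equation 3.11).
    standardised = (logged - logged.mean(axis=0)) / logged.std(axis=0, ddof=1)
    correlation = np.corrcoef(standardised, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(correlation)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Loadings as correlations between the input variables and the PCs.
    loadings = eigenvectors * np.sqrt(np.clip(eigenvalues, 0.0, None))
    scores = standardised @ eigenvectors
    explained = eigenvalues / eigenvalues.sum()
    return eigenvalues, loadings, scores, explained


# Synthetic stand-in for six uni-element data sets (not the case-study data).
rng = np.random.default_rng(0)
synthetic = rng.lognormal(mean=1.0, sigma=0.5, size=(200, 6))
eigenvalues, loadings, scores, explained = pca_from_correlation(synthetic)
```

Because the correlation matrix has unit diagonal, the eigenvalues sum to the number of input variables, so each entry of `explained` is simply an eigenvalue divided by that number. Cleaning of censored values and outliers, which the text treats as essential, is deliberately left out of this sketch.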
In Table 3-VII, the first two PCs (PC1 and PC2) together explain the ‘most common’
variance in the multivariate data and thus represent multi-element associations that
reflect the major geochemical processes in the study area. PC1 accounts for at least 58%
of the total variance and represents a Co-Ni-Zn-Mn-As-Cu association, which reflects a
plausible combination (or overprinting) of lithologic and chemical controls. PC2
explains about 15% of the total variance and represents two antipathetic associations – a
Cu-Ni association reflecting lithologic control and a Mn-Zn association reflecting metal
scavenging control by Mn-oxides. Each of the last four PCs (PC3-PC6) explains the
specific variances in the multivariate data and represents multi-element associations that
reflect either the minor (or subtle) geochemical processes in the study area or errors in
the multivariate data. PC3 accounts for at least 11% of the total variance and represents
two antipathetic associations – an As-dominated multi-element association reflecting
mineralisation control and a Co-Ni association reflecting lithologic control. The last