
‘‘b->h1’’ refers to the bias added by the hidden layer node 1, which is $b_{h_1}$ in the notation of Equation 2.71, and hence 2.74 tells us that $b_{h_1} = 124.29$. ‘‘i1->h1’’ refers to the weight used by the hidden layer node 1 to multiply the value of input node 1, which is $w_{i_1,h_1}$ in the notation of Equation 2.71, and hence 2.74 tells us that $w_{i_1,h_1} = -125.32$. Note that ‘‘i1->o’’ is the weight of the skip-layer connection (i.e. we have $w_{i_1,o_1} = 48.32$).
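As an illustration of how such output is produced, the following sketch fits a small network with a skip-layer connection and prints its weights. The data frame dat and its columns are made-up stand-ins for the data set of the running example:

    library(nnet)

    # Toy stand-in for the data of the running example: a sinusoidal
    # pattern (period of roughly 16 years) with some noise added
    dat <- data.frame(x = 1900:1960)
    dat$y <- sin(0.4 * dat$x) + rnorm(nrow(dat), sd = 0.1)

    # One hidden node, a skip-layer connection and a linear output unit;
    # in practice the input may need to be rescaled for the fit to succeed
    fit <- nnet(y ~ x, data = dat, size = 1, skip = TRUE, linout = TRUE)

    # Prints the fitted weights with labels such as "b->h1", "i1->h1",
    # "h1->o" and "i1->o" (the skip-layer weight)
    summary(fit)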
Note that the results in Figure 2.10 are very similar to the results obtained above using a sinusoidal nonlinear regression function (Figure 2.6). The difference is that in this case the sinusoidal pattern in the data was found by the neural network itself, without an appropriate expression for the regression function having to be identified before the analysis (e.g. based on a graphical analysis of the data as above). As explained above, this is particularly relevant in situations where it is hard to find an appropriate expression for the regression function, for example when more than two input quantities are involved and graphical plots relating the response variable to all input quantities are unavailable. The RSQ produced by the network shown in Figure 2.10 (RSQ = 103.41) is slightly better (i.e. smaller) than the one obtained for the nonlinear regression function in Figure 2.6 (RSQ = 105.23). Comparing these two figures in detail, you will note that the curve fitted by the neural network in Figure 2.10 is not exactly sinusoidal: its values around 1940 exceed its maximum around 1925. This underlines the fact that neural networks are governed by the data only (if sufficiently many nodes are used in the hidden layer): the neural network in Figure 2.10 describes an almost sinusoidal shape, but it also detects small deviations from that shape. In this sense, neural networks have the potential to perform better than nonlinear regression functions such as the one used in Figure 2.6, which is restricted to an exact sinusoidal shape.
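The RSQ values can be read off the fitted model objects directly. Continuing the sketch above, a sinusoidal regression function is fitted to the same toy data with nls for comparison (the parameter names and starting values are illustrative):

    # Nonlinear regression: here the sinusoidal form must be specified
    # in advance, including starting values for the parameters
    fit.nls <- nls(y ~ a + b * sin(c * x + d), data = dat,
                   start = list(a = 0, b = 1, c = 0.4, d = 0))

    # Residual sums of squares of both fits (smaller is better):
    # fit$value is the criterion value reached by nnet's optimizer,
    # deviance() returns the RSS of the nls fit
    c(nnet = fit$value, nls = deviance(fit.nls))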
Note 2.5.6 (Automatic detection of nonlinearities) Neural networks describe the nonlinear dependency of the response variable on the explanatory variables without a prior explicit specification of this nonlinear dependency (which is required in nonlinear regression; see Section 2.4).
The nnet command determines the parameters of the network by minimizing an appropriate fitting criterion [45, 64]. Using the default settings, the RSQ is used in a way similar to the discussion in Sections 2.2 and 2.4 above. The numerical algorithm that works inside nnet thus minimizes e.g. the RSQ as a function of the parameters of the neural network in Equation 2.71, that is, as a function of the weights $w_{i_k,o_j}$, $w_{h_l,o_j}$, $w_{i_k,h_l}$ and of the biases $b_{o_j}$ and $b_{h_l}$. Formally, this is the minimization of a function of several variables, and you know from calculus that if the gradient of such a function vanishes at some point and the Hessian matrix at that point is positive definite, then that point is a local minimum; positive definiteness means that all eigenvalues of the Hessian matrix at that point are positive [65].
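In nnet, the Hessian at the fitted weights and biases can be requested directly via the Hess argument; a minimal sketch, continuing the toy example above:

    # Refitting with Hess = TRUE makes nnet also return the Hessian of
    # the fitting criterion, evaluated at the weights and biases it found
    fit <- nnet(y ~ x, data = dat, size = 1, skip = TRUE, linout = TRUE,
                Hess = TRUE)

    # All eigenvalues positive => positive definite Hessian => the
    # optimizer has found a local minimum
    eigen(fit$Hessian, only.values = TRUE)$values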
the Hessian matrix are computed (referring to the particular weights and biases
found by
nnet), and the result corresponding to the neural network in Figure 2.10