-- A FUNCTION-LEARNING BACKPROPAGATION NETWORK
-----------------------------------------------------------------------------------------------
-- note that the submodel definition does not depend on nx, ny, nv
--
ARRAY x$[1], y$[1], v$[1], W1$[1, 1], W2$[1, 1]
SUBMODEL NET2(x$, y$, v$, W1$, W2$)
Vector v$ = tanh(W1$ * x$)
Vector y$ = W2$ * v$
end
----------------------------------------------------------------------------------------------
nx = 1 | ny = 1 | nv = 5 | -- nv is the number of hidden neurons
--
ARRAY x[nx] + x0[1] = xx | x0[1] = 1 | -- introduce bias
ARRAY v[nv], y[ny], target[ny], error[ny], delta2[nv]
ARRAY WW1[nv, nx + 1], W2[ny, nv], Dww1[nv, nx + 1], Dw2[ny, nv]
--
-- random initial weights
for i = 1 to nv
WW1[i, 1] = 0.2 * ran() | WW1[i, 2] = 0.2 * ran() | W2[1, i] = 0.2 * ran()
next
----------------------------------------------------------------------------------------------
-- set experiment parameters
lrate1 = 1 | lrate2 = 0.3 | mom1 = 0.1 | mom2 = 0.1
scale = 0.5 | NN = 10000
--
for i = 1 to 3 | drun | next | -- training runs
lrate1 = 0.4 | lrate2 = 0.15 | -- decrease lrate
for i = 1 to 10 | drun | next
--
write "type go for a recall run" | STOP
drun RECALL
------------------------------------------------------------------
DYNAMIC
------------------------------------------------------------------
x[1] = ran() | target[1] = 0.4 * sin(4 * x[1])
invoke NET2(xx, y, v, WW1, W2)
------------------------------------------------- training
Vector error = target - y
Vector delta2 = W2% * error * (1 - v^2)
MATRIX Dww1 = lrate1 * delta2 * xx + mom1 * Dww1
MATRIX Dw2 = lrate2 * error * v + mom2 * Dw2
DELTA WW1 = Dww1 | DELTA W2 = Dw2
------------------------------------------------------------------
--
label RECALL
x[1] = ran() | target[1] = 0.4 * sin(4 * x[1])
invoke NET2(xx, y, v, WW1, W2)
Vector error = target - y
FIGURE 6-4a. Training program and recall test for a two-layer backpropagation network learning the sine function by mean-square regression of a random input on the target function 0.4 * sin(4 * x[1]). The network (but in this case not the training program) is defined as a convenient submodel that can be stored and reused with different input, output, and hidden-layer dimensions nx, ny, nv. xx, WW1, and Dww1 are bias-augmented arrays (Section 6-2).
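The listing's forward pass and delta-rule updates translate directly to NumPy. The sketch below is a rough equivalent, assuming the usual reading of the DESIRE operators (`%` as matrix transpose, `DELTA` as an additive weight update): identifiers such as `nv`, `lrate1`, and `mom1` follow the listing, while the random seed, the helper functions, and the recall-error measurement are illustrative assumptions, not part of the original program.

```python
import numpy as np

rng = np.random.default_rng(0)

nx, ny, nv = 1, 1, 5                      # input, output, hidden dimensions
W1 = 0.2 * rng.random((nv, nx + 1))       # bias-augmented first layer (WW1)
W2 = 0.2 * rng.random((ny, nv))
DW1 = np.zeros_like(W1)                   # momentum accumulators (Dww1, Dw2)
DW2 = np.zeros_like(W2)

def sample():
    """Random input and target, as in the DYNAMIC segment."""
    x = rng.random(nx)
    return x, 0.4 * np.sin(4 * x)

def forward(xx):
    """SUBMODEL NET2: tanh hidden layer, linear output layer."""
    v = np.tanh(W1 @ xx)
    return W2 @ v, v

def train_step(lrate1, lrate2, mom1=0.1, mom2=0.1):
    global W1, W2, DW1, DW2
    x, target = sample()
    xx = np.append(x, 1.0)                # append bias input x0 = 1
    y, v = forward(xx)
    error = target - y
    delta2 = (W2.T @ error) * (1 - v**2)  # backpropagated hidden-layer error
    DW1 = lrate1 * np.outer(delta2, xx) + mom1 * DW1
    DW2 = lrate2 * np.outer(error, v) + mom2 * DW2
    W1 += DW1                             # DELTA WW1 | DELTA W2
    W2 += DW2

def recall_mse(n=1000):
    """RECALL run: mean-squared error on fresh samples, no learning."""
    total = 0.0
    for _ in range(n):
        x, target = sample()
        y, _ = forward(np.append(x, 1.0))
        total += float((target - y) @ (target - y))
    return total / n

mse_before = recall_mse()
for _ in range(3 * 10000):                # three training runs, NN = 10000
    train_step(1.0, 0.3)
for _ in range(10 * 10000):               # ten more runs at decreased lrate
    train_step(0.4, 0.15)
mse_after = recall_mse()
```

The two training loops mirror the listing's schedule: a few runs at the large learning rates, then more runs after `lrate1` and `lrate2` are decreased, which reduces the residual jitter of the per-sample updates around the least-squares solution.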