10.5 Approximate Dynamic Programming and Special Cases 441
s. t. W
t
x
t
= y
t
,
T
t
x
t
−y
t+1
= 0 ,
ξ
t
≥ x
t
≥ 0 .
(5.8)
If Q
t+1
(y
t+1
) is linear with coefficients,
¯
Q
t+1
(i) in each component i of y
t+1
as
it is for t = H −1 , then the optimal solution to (5.8) is given by the increasing or-
dering of c
t, f
(ij)+
ˆ
Q
t+1
( j) with each successive x
t, f
(ij) used up to the minimum
of y
t
(i) and
ξ
t
(ij) according to this realization of ξ
t
. The key is then to construct
a linear approximation to Q
t+1
(y
t+1
) .
With a linearization, the entire strategy can be simply carried back to the first
period. As in other ADP methods, this represents a feasible but not optimal strategy
because it avoids calculating the full nonlinear value function. One way to compute
the linearization is to assume an input value ˆy
t
(i) and to find the probability of each
option multiplied by the expected linearized value of that option. Using this to de-
termine the recourse value at each stage can lead to a lower bound at each stage and
overall when the first-period problem is solved (see Exercise 4). An upper bound-
ing linearization is also possible. This is analogous to the Edmundson-Madansky
approach (Exercise 5).
Frantzeskakis and Powell [1993] mention that extensions of nodal recourse can
apply to general network problems. These procedures are similar to the separable
bounding procedures presented next. They again rely on building responses to ran-
dom variation that depend separately on the random components and that are also
feasible.
c. Piecewise-linear separable bounds
Another approach to ADP is to extend the basic separable bounds presented in Sec-
tion 8.5b. to multistage problems. The main idea is to use the two-stage method
repeatedly to approximate the objective function by separable functions (and not
just single affine functions as in (5.2)). For linear problems, this leads to sublin-
ear or piecewise linear functions as in Section 8.5b. Functions without recession
directions (e.g., quadratic functions) would require some type of nonlinear (e.g.,
quadratic) function that should again be easily integrable, requiring, for example,
limited moment information (second moments for quadratic functions). We con-
sider the linear case (following Birge [1989]).
The goal is to construct a problem that is separable in the components of the
random vector. In each period t , a decision, x
t
, is made subject to the constraints,
W
t
x
t
=
ξ
t
−T
t−1
x
t−1
, x
t
≥ 0,where
ξ
t
is the realization of random constraints
and x
t−1
was the decision in period t −1 . The objective contribution from this
decision is c
t
x
t
. We can view this decision as a response to the input,
η
t
=
ξ
t
−
T
t−1
x
t−1
. The period t decision, x
t
, then becomes a function of this input, so
x
t
(
ω
) becomes x
t
(η
t
) . Problem (2.2) becomes