147x Filetype PDF File size 0.33 MB Source: www.artima.com
Chapter 2 Dynamic Programming and Optimal Control of Discrete-Time Systems 2.1 Closed Loop Optimization Example: Inventory Control Weconsider the minimization of the expected cost of ordering quantities in order to meet a stochastic demand. Rule: The ordering is only possible at discrete time instants t0 < t1 < ··· < tN−1; N ∈ lN: Excess demand is backlogged and filled as additional inventory becomes available. x : available stock k uk : order at time tk; 0 ≤ k ≤ N−1 wk : demand Hence, the stock evolves according to the discrete-time system x =x +u −w ; 0≤k≤N−1: k+1 k k k Costs at time tk : r(x ) holding cost (excess inventory) or shortage cost (unsatisfied demand) k c purchasing cost (cost per ordered unit) Terminal cost: R(x ) left over inventory N Total cost: N−1 E(R(x )+ X (r(x )+cu )): N k k k=0 Optimal Control of Discrete-Time Systems (DTS) Consider the minimization problem N−1 (DTS) minJ (x ); J (x ) := E(g (x )+ X g (x ;µ (x );w )) π∈Π π 0 π 0 N N k k k k k k=0 subject to x =f (x ;µ (x );w ); 0 ≤ k ≤ N−1 k+1 k k k k k gk : Sk × Ck × Dk → lR; 0 ≤ k ≤ N−1 cost functionals gN : SN → lR terminal cost f : S ×C ×D →S ; 0≤k≤N−1 discrete dynamics k k k k k+1 S ; 0 ≤ k ≤ N state spaces x ∈S states k k k C ; 0 ≤ k ≤ N−1 control spaces u ∈U (x )⊂C controls k k k k k Dk; 0 ≤ k ≤ N disturbance spaces wk∈Dk disturbances (random) π = {µ0;···;µN−1} control policy µk : Sk → Sk; 0 ≤ k ≤ N−1 control laws Π set of admissible control policies ∗ J (x ) = minJ (x ) optimal cost function (optimal value function) 0 π∈Π π 0 ∗ ∗ ∗ ∗ If there exists π ∈ Π such that J (x ) = J (x ); then π is called an optimal policy. π 0 0 Open-loop minimization (ordering decisions are made at time t0) N−1 minimize E(R(x )+ X (r((x )+cu )) N k k k=0 subject to x =x +u −w ; 0≤k≤N−1 k+1 k k k Closed-loop minimization (ordering decisions are made at time tk) Determine a control policy π = {µ }N−1; µ = µ (x ) such that k k=0 k k k N−1 minimize J (x ) = E(R(x )+ X (r(x )+cµ (x ))) π 0 N k k k k=0 subject to x =x +µ (x )−w ; 0≤k≤N−1 k+1 k k k k Dynamic Programming is about the solution of closed-loop minimization problems.
no reviews yet
Please Login to review.