A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
Pages: 261–278, Issue 3, July 2016
doi: 10.3934/jdg.2016014
Óscar Vega-Amaya, Departamento de Matemáticas, Universidad de Sonora, Rosales s/n, Col. Centro, 83000, Hermosillo, Sonora, Mexico
Joaquín López-Borbón, Departamento de Matemáticas, Universidad de Sonora, Rosales s/n, Col. Centro, 83000, Hermosillo, Sonora, Mexico
1. A. Almudevar, Approximate fixed point iteration with an application to infinite horizon Markov decision processes, SIAM Journal on Control and Optimization, 47 (2008), 2303–2347.

2. E. F. Arruda, M. D. Fragoso and J. B. R. do Val, Approximate dynamic programming via direct search in the space of value function approximations, European Journal of Operational Research, 211 (2011), 343–351.

3. D. P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models, Prentice-Hall, Englewood Cliffs, NJ, 1987.

4. D. P. Bertsekas, Approximate policy iteration: A survey and some new methods, Journal of Control Theory and Applications, 9 (2011), 310–335.

5. D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, Belmont, MA, 1996.

6. L. Beutel, H. Gonska and D. Kacsó, On variation-diminishing Schoenberg operators: New quantitative statements, in Multivariate Approximation and Interpolation with Applications (ed. M. Gasca), Monografías de la Academia de Ciencias de Zaragoza, 20 (2002), 9–58.

7. R. A. DeVore, The Approximation of Continuous Functions by Positive Linear Operators, Lecture Notes in Mathematics 293, Springer-Verlag, Berlin-Heidelberg, 1972.

8. F. Dufour and T. Prieto-Rumeau, Approximation of infinite horizon discounted cost Markov decision processes, in Optimization, Control, and Applications of Stochastic Systems: In Honor of Onésimo Hernández-Lerma (eds. D. Hernández-Hernández and J. A. Minjárez-Sosa), Birkhäuser, (2012), 59–76.

9. D. P. de Farias and B. Van Roy, On the existence of fixed points for approximate value iteration and temporal-difference learning, Journal of Optimization Theory and Applications, 105 (2000), 589–608.

10. G. J. Gordon, Stable function approximation in dynamic programming, in Proceedings of the Twelfth International Conference on Machine Learning, Tahoe City, CA, 1995, 261–268.

11. O. Hernández-Lerma, Adaptive Markov Control Processes, Springer-Verlag, New York, 1989.

12. O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes: Basic Optimality Criteria, Springer-Verlag, New York, 1996.

13. O. Hernández-Lerma and J. B. Lasserre, Further Topics on Discrete-Time Markov Control Processes, Springer-Verlag, New York, 1999.

14. D. R. Jiang and W. B. Powell, An approximate dynamic programming algorithm for monotone value functions, Operations Research, 63 (2015), 1489–1511.

15. R. Munos, Performance bounds in $L_p$ norm for approximate value iteration, SIAM Journal on Control and Optimization, 46 (2007), 541–561.

16. W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality, John Wiley & Sons Inc., 2007.

17. W. B. Powell and J. Ma, A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications, Journal of Control Theory and Applications, 9 (2011), 336–352.

18. W. B. Powell, Perspectives of approximate dynamic programming, Annals of Operations Research, 241 (2016), 319–356.

19. M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, New York, 1994.

20. J. Rust, Numerical dynamic programming in economics, in Handbook of Computational Economics (eds. H. Amman, D. Kendrick and J. Rust), Elsevier, 1 (1996), 619–729.

21. J. Stachurski, Continuous state dynamic programming via nonexpansive approximation, Computational Economics, 31 (2008), 141–160.

22. J. Stachurski, Economic Dynamics: Theory and Computation, MIT Press, Cambridge, MA, 2009.

23. S. Stidham Jr. and R. Weber, A survey of Markov decision models for control of networks of queues, Queueing Systems, 13 (1993), 291–314.

24. R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.

25. B. Van Roy, Performance loss bounds for approximate value iteration with state space aggregation, Mathematics of Operations Research, 31 (2006), 234–244.

26. O. Vega-Amaya and R. Montes-de-Oca, Application of average dynamic programming to inventory systems, Mathematical Methods of Operations Research, 47 (1998), 451–471.

27. D. J. White, Real applications of Markov decision processes, Interfaces, 15 (1985), 73–83.

28. D. J. White, Further real applications of Markov decision processes, Interfaces, 18 (1988), 55–61.

29. D. J. White, A survey of applications of Markov decision processes, The Journal of the Operational Research Society, 44 (1993), 1073–1096.

30. S. Yakowitz, Dynamic programming applications in water resources, Water Resources Research, 18 (1982), 673–696.

31. W. W.-G. Yeh, Reservoir management and operation models: A state-of-the-art review, Water Resources Research, 21 (1985), 1797–1818.

