Journal of Dynamics and Games (JDG)

A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
Pages: 261 - 278, Issue 3, July 2016

doi:10.3934/jdg.2016014

Óscar Vega-Amaya - Departamento de Matemáticas, Universidad de Sonora, Rosales s/n, Col. Centro, 83000, Hermosillo, Sonora, Mexico
Joaquín López-Borbón - Departamento de Matemáticas, Universidad de Sonora, Rosales s/n, Col. Centro, 83000, Hermosillo, Sonora, Mexico

1 A. Almudevar, Approximate fixed point iteration with an application to infinite horizon Markov decision processes, SIAM Journal on Control and Optimization, 47 (2008), 2303-2347.       
2 E. F. Arruda, M. D. Fragoso and J. B. R. do Val, Approximate dynamic programming via direct search in the space of value function approximations, European Journal of Operational Research, 211 (2011), 343-351.       
3 D. P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models, Prentice-Hall, Englewood Cliffs, NJ, 1987.       
4 D. P. Bertsekas, Approximate policy iteration: A survey and some new methods, Journal of Control Theory and Applications, 9 (2011), 310-335.       
5 D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, Belmont, MA, 1996.
6 L. Beutel, H. Gonska and D. Kacsó, On variation-diminishing Schoenberg operators: New quantitative statements, in Multivariate Approximation and Interpolations with Applications (ed. M. Gasca), Monografías de la Academia de Ciencias de Zaragoza, 20 (2002), 9-58.
7 R. A. DeVore, The Approximation of Continuous Functions by Positive Linear Operators, Lecture Notes in Mathematics 293, Springer-Verlag, Berlin Heidelberg, 1972.
8 F. Dufour and T. Prieto-Rumeau, Approximation of infinite horizon discounted cost Markov decision processes, in Optimization, Control, and Applications of Stochastic Systems, In Honor of Onésimo Hernández-Lerma (eds. D. Hernández-Hernández and J. A. Minjárez-Sosa), Birkhäuser, (2012), 59-76.       
9 D. P. de Farias and B. Van Roy, On the existence of fixed points for approximate value iteration and temporal difference learning, Journal of Optimization Theory and Applications, 105 (2000), 589-608.       
10 G. J. Gordon, Stable function approximation in dynamic programming, Proceedings of the Twelfth International Conference on Machine Learning, Tahoe City, CA, 1995, 261-268.
11 O. Hernández-Lerma, Adaptive Markov Control Processes, Springer-Verlag, NY, 1989.       
12 O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes. Basic Optimality Criteria, Springer-Verlag, NY, 1996.       
13 O. Hernández-Lerma and J. B. Lasserre, Further Topics on Discrete-Time Markov Control Processes, Springer-Verlag, NY, 1999.       
14 D. R. Jiang and W. B. Powell, An approximate dynamic programming algorithm for monotone value functions, Operations Research, 63 (2015), 1489-1511.
15 R. Munos, Performance bounds in $L_p$-norm for approximate value iteration, SIAM Journal on Control and Optimization, 46 (2007), 541-561.
16 W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality, John Wiley & Sons Inc., 2007.
17 W. B. Powell and J. Ma, A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications, Journal of Control Theory and Applications, 9 (2011), 336-352.
18 W. B. Powell, Perspectives of approximate dynamic programming, Annals of Operations Research, 241 (2016), 319-356.
19 M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, NY, 1994.       
20 J. Rust, Numerical dynamic programming in economics, in Handbook of Computational Economics, (eds. H. Amman, D. Kendrick and J. Rust), Elsevier, 1 (1996), 619-729.       
21 J. Stachurski, Continuous state dynamic programming via nonexpansive approximation, Computational Economics, 31 (2008), 141-160.
22 J. Stachurski, Economic Dynamics: Theory and Computation, MIT Press, Cambridge, MA, 2009.
23 S. Stidham Jr. and R. Weber, A survey of Markov decision models for control of networks of queues, Queueing Systems, 13 (1993), 291-314.       
24 R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
25 B. Van Roy, Performance loss bounds for approximate value iteration with state space aggregation, Mathematics of Operations Research, 31 (2006), 234-244.       
26 O. Vega-Amaya and R. Montes-de-Oca, Application of average dynamic programming to inventory systems, Mathematical Methods of Operations Research, 47 (1998), 451-471.       
27 D. J. White, Real applications of Markov decision processes, Interfaces, 15 (1985), 73-83.
28 D. J. White, Further real applications of Markov decision processes, Interfaces, 18 (1988), 55-61.
29 D. J. White, A survey of applications of Markov decision processes, The Journal of the Operational Research Society, 44 (1993), 1073-1096.
30 S. Yakowitz, Dynamic programming applications in water resources, Water Resources Research, 18 (1982), 673-696.
31 W. W.-G. Yeh, Reservoir management and operation models: A state-of-the-art review, Water Resources Research, 21 (1985), 1797-1818.
