TY - JOUR
T1 - Optimal control of ergodic continuous-time Markov chains with average sample-path rewards
AU - Guo, Xianping
AU - Cao, Xi Ren
PY - 2006
Y1 - 2006
N2 - In this paper we study continuous-time Markov decision processes with the average sample-path reward (ASPR) criterion and possibly unbounded transition and reward rates. We propose conditions on the system's primitive data for the existence of ε-ASPR-optimal (deterministic) stationary policies in a class of randomized Markov policies satisfying some additional continuity assumptions. The proof of this fact is based on the time discretization technique, the martingale stability theory, and the concept of potential. We also provide both policy and value iteration algorithms for computing, or at least approximating, the ε-ASPR-optimal stationary policies. We illustrate with examples our main results as well as the difference between the ASPR and the average expected reward criteria.
AB - In this paper we study continuous-time Markov decision processes with the average sample-path reward (ASPR) criterion and possibly unbounded transition and reward rates. We propose conditions on the system's primitive data for the existence of ε-ASPR-optimal (deterministic) stationary policies in a class of randomized Markov policies satisfying some additional continuity assumptions. The proof of this fact is based on the time discretization technique, the martingale stability theory, and the concept of potential. We also provide both policy and value iteration algorithms for computing, or at least approximating, the ε-ASPR-optimal stationary policies. We illustrate with examples our main results as well as the difference between the ASPR and the average expected reward criteria.
KW - Average sample-path reward
KW - Continuous-time Markov chain
KW - Optimal stationary policy
KW - Policy and value iteration algorithms
UR - https://www.webofscience.com/wos/woscc/full-record/WOS:000231665600002
UR - https://www.scopus.com/pages/publications/33244489385
U2 - 10.1137/S0363012903420875
DO - 10.1137/S0363012903420875
M3 - Journal Article
SN - 0363-0129
VL - 44
SP - 29
EP - 48
JO - SIAM Journal on Control and Optimization
JF - SIAM Journal on Control and Optimization
IS - 1
ER -