Semi-Markov decision problems and performance sensitivity analysis

Xi Ren Cao*

*Corresponding author for this work

Research output: Contribution to journal › Journal Article › peer-review

90 Citations (Scopus)

Abstract

Recent research indicates that Markov decision processes (MDPs) can be viewed from a sensitivity point of view, and that perturbation analysis (PA), MDPs, and reinforcement learning (RL) are three closely related areas in the optimization of discrete-event dynamic systems that can be modeled as Markov processes. The goal of this paper is twofold: first, to develop PA theory for semi-Markov processes (SMPs); and second, to extend the aforementioned results on the relation among PA, MDPs, and RL to SMPs. In particular, we show that performance sensitivity formulas and policy iteration algorithms for semi-Markov decision processes (SMDPs) can be derived from performance potentials and the realization matrix. Both the long-run average and discounted-cost problems are considered; the approach provides a unified framework for both, with the long-run average problem corresponding to the discount factor being zero. The results indicate that performance sensitivities and optimization depend only on first-order statistics. Single-sample-path-based implementations are also discussed.
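To make the abstract's policy-iteration idea concrete, the following is a minimal sketch of potential-based policy iteration for a small unichain SMDP. The two-state, two-action model, the numerical data, and the function names are hypothetical illustrations, not taken from the paper; the evaluation step solves the Poisson-type equation g = c - η·τ + P g (with the normalization g[0] = 0), and the improvement step picks, per state, the action minimizing c(i,a) - η·τ(i,a) + Σ_j p(j|i,a) g(j).

```python
import numpy as np

def evaluate(policy, P, c, tau):
    """Solve the SMDP Poisson equation for the potentials g and average
    cost eta:  g = c_d - eta * tau_d + P_d g,  normalized by g[0] = 0."""
    n = len(policy)
    Pd = np.array([P[a][i] for i, a in enumerate(policy)])
    cd = np.array([c[a][i] for i, a in enumerate(policy)])
    taud = np.array([tau[a][i] for i, a in enumerate(policy)])
    # Unknowns: g[0..n-1] and eta; the last row pins g[0] = 0.
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = np.eye(n) - Pd
    A[:n, n] = taud
    A[n, 0] = 1.0
    b = np.concatenate([cd, [0.0]])
    x = np.linalg.solve(A, b)
    return x[:n], x[n]

def policy_iteration(P, c, tau, n_states, n_actions):
    """Average-cost policy iteration driven by the potentials g."""
    policy = [0] * n_states
    while True:
        g, eta = evaluate(policy, P, c, tau)
        # Improvement: per-state minimization of c - eta*tau + P g.
        new = [min(range(n_actions),
                   key=lambda a: c[a][i] - eta * tau[a][i] + P[a][i] @ g)
               for i in range(n_states)]
        if new == policy:
            return policy, eta
        policy = new

# Hypothetical 2-state, 2-action SMDP: P[a][i][j] is the embedded-chain
# transition probability, tau[a][i] the mean sojourn time, c[a][i] the
# expected cost incurred per transition.
P = [np.array([[0.5, 0.5], [0.5, 0.5]]),
     np.array([[0.2, 0.8], [0.8, 0.2]])]
c = [[2.0, 2.0], [1.0, 1.0]]
tau = [[1.0, 1.0], [2.0, 2.0]]

policy, eta = policy_iteration(P, c, tau, n_states=2, n_actions=2)
```

Because only the potentials (first-order quantities estimable from a single sample path) enter the improvement step, the same loop applies whether g is computed from the model, as here, or estimated from simulation.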

Original language: English
Pages (from-to): 758-769
Number of pages: 12
Journal: IEEE Transactions on Automatic Control
Volume: 48
Issue number: 5
DOIs
Publication status: Published - May 2003

Keywords

  • Discounted Poisson equations
  • Discrete-event dynamic systems (DEDS)
  • Lyapunov equations
  • Markov decision processes (MDPs)
  • Perturbation analysis (PA)
  • Perturbation realization
  • Poisson equations
  • Policy iteration
  • Potentials
  • Reinforcement learning (RL)
