Abstract
In this chapter, we study the optimization of the long-run average of multi-class time-nonhomogeneous Markov chains (TNHMCs). We show that with confluencity, state classification, and relative optimization, we can obtain the necessary and sufficient conditions for optimal policies of the average reward of TNHMCs consisting of multiple confluent classes (multi-chains). The optimality conditions do not need to hold in any finite period, or “non-frequently visited” time sequence. In the analysis, we assume that the limit of the average exists. In general, the performance should be defined as the “liminf” of the average. However, because of the non-linear property of “liminf”, it is not well-defined for branching states, unless the TNHMC is “asynchronous” among different confluent classes. This property is also studied.
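The remark about the non-linearity of "liminf" can be illustrated with a short sketch. The notation here is assumed for illustration and is not taken from the chapter: $a_K$ and $b_K$ stand for the $K$-step average rewards along two confluent classes, and $p \in (0,1)$ for the probability that a branching state eventually joins the first class.

```latex
% liminf is superadditive, not additive:
\[
  \liminf_{K\to\infty}\bigl[p\,a_K + (1-p)\,b_K\bigr]
  \;\ge\;
  p\,\liminf_{K\to\infty} a_K + (1-p)\,\liminf_{K\to\infty} b_K ,
\]
% and the inequality can be strict. Hypothetical example with p = 1/2:
\[
  a_K = \sin(\log K), \qquad b_K = -\sin(\log K),
\]
\[
  \liminf_{K\to\infty} a_K = \liminf_{K\to\infty} b_K = -1,
  \qquad
  \liminf_{K\to\infty} \tfrac12\,(a_K + b_K) = 0 > -1 .
\]
```

In this example the two averages oscillate out of phase, so the "liminf" of the mixture differs from the mixture of the "liminf" values. When the limits of $a_K$ and $b_K$ both exist, the inequality becomes an equality, which is consistent with the abstract's assumption that the limit of the average exists.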
| Original language | English |
|---|---|
| Title of host publication | SpringerBriefs in Control, Automation and Robotics |
| Publisher | Springer |
| Pages | 59-78 |
| Number of pages | 20 |
| Publication status | Published - 2021 |
Publication series
| Name | SpringerBriefs in Control, Automation and Robotics |
|---|---|
| ISSN (Print) | 2192-6786 |
| ISSN (Electronic) | 2192-6794 |
Bibliographical note
Publisher Copyright: © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.