Abstract
Finding optimal policies for general partially observable Markov decision processes (POMDPs) is computationally difficult, primarily due to the need to perform dynamic-programming (DP) updates over the entire belief space. In this paper, we first study a somewhat restrictive class of special POMDPs called almost-discernible POMDPs and propose an anytime algorithm called space-progressive value iteration (SPVI). SPVI does not perform DP updates over the entire belief space. Rather, it restricts DP updates to a belief subspace that grows over time. It is argued that, given sufficient time, SPVI can find near-optimal policies for almost-discernible POMDPs. We then show how SPVI can be applied to a more general class of POMDPs. Empirical results are presented to show the effectiveness of SPVI.
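The anytime scheme described in the abstract, DP backups restricted to a belief subspace that is enlarged between rounds, can be sketched as follows. This is an illustrative toy, not the paper's actual algorithm: the 2-state belief grid, the reward, the backup rule, and the grid-refinement `expand` step are all hypothetical stand-ins.

```python
# Illustrative sketch of an anytime loop in the spirit of SPVI:
# DP updates are restricted to a belief subspace that grows over time.
# Beliefs over a toy 2-state POMDP are represented by the probability
# of state 0, so the "subspace" is a grid of points in [0, 1].

GAMMA = 0.9  # discount factor (arbitrary choice for the toy)

def dp_update(values, subspace):
    """One toy DP backup restricted to the given belief points."""
    # Best value found so far among points in the current subspace.
    best = max(values.get(b, 0.0) for b in subspace)
    # Toy backup: immediate reward is the belief itself, plus the
    # discounted best value within the restricted subspace.
    return {b: b + GAMMA * best for b in subspace}

def expand(subspace):
    """Grow the subspace by refining the belief grid (double resolution)."""
    n = len(subspace)
    return [i / (2 * n) for i in range(2 * n + 1)]

def spvi(rounds=3):
    """Anytime loop: value iteration over a progressively larger subspace."""
    subspace = [0.0, 0.5, 1.0]  # coarse initial belief grid
    values = {}
    for _ in range(rounds):
        values = dp_update(values, subspace)  # DP only on the subspace
        subspace = expand(subspace)           # then enlarge the subspace
    return values, subspace
```

Because each round does work proportional only to the current subspace, the loop can be stopped at any time and still return a value function, which is the anytime property the abstract emphasizes.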
| Original language | English |
|---|---|
| Title of host publication | Symbolic and Quantitative Approaches to Reasoning with Uncertainty - 6th European Conference, ECSQARU 2001, Proceedings |
| Editors | Salem Benferhat, Philippe Besnard |
| Publisher | Springer Verlag |
| Pages | 72-83 |
| Number of pages | 12 |
| ISBN (Electronic) | 3540424644, 9783540424642 |
| DOIs | |
| Publication status | Published - 2001 |
| Event | 6th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty, ECSQARU 2001 - Toulouse, France Duration: 19 Sept 2001 → 21 Sept 2001 |
Publication series
| Name | Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) |
|---|---|
| Volume | 2143 |
| ISSN (Print) | 0302-9743 |
Conference
| Conference | 6th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty, ECSQARU 2001 |
|---|---|
| Country/Territory | France |
| City | Toulouse |
| Period | 19/09/01 → 21/09/01 |
Bibliographical note
Publisher Copyright: © Springer-Verlag Berlin Heidelberg 2001.