On Continuity of Transition Probabilities in Belief MDPs with General State and Action Spaces


Abstract

Natural sufficient conditions for weak continuity of transition probabilities in belief MDPs (Markov decision processes) were established in our paper published in Mathematics of Operations Research in 2016. In particular, the transition probability in the belief MDP is weakly continuous if, in the original MDP, the transition probability is weakly continuous and the observation probability is continuous in total variation. These results imply sufficient conditions for the existence of optimal policies in POMDPs (partially observable MDPs) and provide computational methods for finding them. Recently, Kara, Saldi, and Yuksel proved weak continuity of the transition probability for the belief MDP under the assumptions that the transition probability for the original MDP is continuous in total variation and that the observation probability does not depend on controls. In this paper, we show that the following two conditions imply weak continuity of transition probabilities for belief MDPs when observation probabilities depend on controls: (i) transition probabilities for the original MDP are continuous in total variation, and (ii) observation probabilities are measurable, and their dependence on controls is continuous in total variation.
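
For reference, the two continuity notions named above can be written out as follows. This is only a sketch using standard definitions. The symbols (state space X, action space A, observation space Y, transition kernel P, observation kernel Q, and the order of Q's arguments) are assumed here for illustration and are not fixed by the abstract, and the last display is one possible reading of condition (ii), not a quotation from the paper.

```latex
% Assumed notation (not taken from the abstract):
%   X, A, Y        state, action, and observation spaces
%   P(dx' | x, a)  transition kernel of the original MDP
%   Q(dy | a, x')  observation kernel

% Weak continuity of P: for every bounded continuous f : X -> R,
\[
(x_n, a_n) \to (x, a)
\;\Longrightarrow\;
\int_X f(x')\, P(dx' \mid x_n, a_n) \to \int_X f(x')\, P(dx' \mid x, a).
\]

% Continuity of P in total variation:
\[
(x_n, a_n) \to (x, a)
\;\Longrightarrow\;
\sup_{B \in \mathcal{B}(X)}
\bigl| P(B \mid x_n, a_n) - P(B \mid x, a) \bigr| \to 0.
\]

% One reading of condition (ii): for each fixed state x',
\[
a_n \to a
\;\Longrightarrow\;
\sup_{C \in \mathcal{B}(Y)}
\bigl| Q(C \mid a_n, x') - Q(C \mid a, x') \bigr| \to 0.
\]
```

Continuity in total variation is the stronger of the two notions: it implies weak continuity, while the converse fails in general.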
