This paper describes the structure of optimal policies for infinite-state Markov Decision Processes with setwise continuous transition probabilities. The action sets may be noncompact. The objective criteria are either the expected total costs, discounted or undiscounted, or the average costs per unit time. The analysis of the optimality equations and inequalities is based on the optimal selection theorem for inf-compact functions introduced in this paper.
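For orientation, a standard form of the discounted-cost optimality equation in this setting (a hedged sketch, not necessarily the paper's exact statement) is
$$ v(x) \;=\; \inf_{a \in A(x)} \Big[\, c(x,a) + \alpha \int_X v(x')\, P(dx' \mid x,a) \Big], \qquad x \in X, $$
where $c$ is the one-step cost, $P$ the transition probability, $A(x)$ the (possibly noncompact) set of actions available at $x$, and $\alpha \in [0,1)$ the discount factor; inf-compactness of the cost in the action variable is the type of condition under which the infimum is attained by a measurable selector, which is what an optimal selection theorem addresses.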
Natural conditions sufficient for weak continuity of transition probabilities in belief Markov decision processes (MDPs) were established in our paper published in Mathematics of Operations Research in 2016. In particular, the transition probability …
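For orientation, a hedged sketch of the belief-state transition referred to above: if the hidden state evolves according to a kernel $P(dx' \mid x,a)$ and observations are generated by a kernel with density $q(y \mid a, x')$, then, after action $a$ and observation $y$, a prior belief $z$ on the state space $X$ is updated by Bayes' rule to
$$ z'(dx' \mid z,a,y) \;=\; \frac{q(y \mid a, x')\, \int_X P(dx' \mid x,a)\, z(dx)}{\int_X \int_X q(y \mid a, u)\, P(du \mid x,a)\, z(dx)}, $$
and the transition probability of the belief MDP is the distribution of $z'$ when $y$ is drawn from its predictive (marginal) distribution under $z$ and $a$.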
This paper studies average-cost Markov decision processes with semi-uniform Feller transition probabilities. This class of MDPs was recently introduced by the authors to study MDPs with incomplete information. This paper studies the validity of optim…
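For context, a standard (hedged) form of the average-cost optimality inequality studied in this setting: with $w^*$ the optimal average cost per unit time and $u$ a relative value function,
$$ w^* + u(x) \;\ge\; \inf_{a \in A(x)} \Big[\, c(x,a) + \int_X u(x')\, P(dx' \mid x,a) \Big], \qquad x \in X, $$
and a stationary policy attaining the infimum on the right-hand side (when such a selector exists) has average cost at most $w^*$ and is therefore average-cost optimal.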
This paper deals with the control of partially observable discrete-time stochastic systems. It introduces and studies the class of Markov Decision Processes with Incomplete Information and with semi-uniform Feller transition probabilities. The important …
We propose a new, nonparametric approach to learning and representing transition dynamics in Markov decision processes (MDPs), which can be combined easily with dynamic programming methods for policy optimisation and value estimation. This approach m…
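As an illustrative, hedged sketch of how a nonparametric transition model can be plugged into dynamic programming (this is a generic kernel-smoothing construction, not the paper's actual method; all function and parameter names below are hypothetical), one can smooth observed transitions with a kernel and run value iteration on the resulting finite model:

    import numpy as np

    def rbf(x, y, bandwidth):
        # Gaussian similarity between two state vectors.
        return np.exp(-np.sum((x - y) ** 2) / (2.0 * bandwidth ** 2))

    def kernel_weights(query, anchors, bandwidth):
        # Normalised kernel weights of `query` against each row of `anchors`.
        w = np.array([rbf(query, s, bandwidth) for s in anchors])
        return w / w.sum()

    def kernel_value_iteration(data, gamma=0.95, bandwidth=0.5, iters=200):
        # data: dict mapping action -> (S, C, S_next), where S and S_next are
        # (n, d) arrays of observed states / next states and C is the (n,)
        # array of observed one-step costs for that action.
        actions = list(data)
        grid = np.vstack([data[a][2] for a in actions])   # evaluation points
        V = np.zeros(len(grid))

        # For every grid point and action: kernel weights over that action's
        # sampled transitions, plus where its next states sit in the grid.
        W, succ, cost, offset = {}, {}, {}, 0
        for a in actions:
            S, C, S_next = data[a]
            W[a] = np.array([kernel_weights(z, S, bandwidth) for z in grid])
            succ[a] = np.arange(offset, offset + len(S_next))
            cost[a] = np.asarray(C, dtype=float)
            offset += len(S_next)

        for _ in range(iters):
            # Smoothed Bellman backup: expected cost-to-go under the model.
            Q = np.stack([W[a] @ (cost[a] + gamma * V[succ[a]])
                          for a in actions])
            V = Q.min(axis=0)

        greedy = Q.argmin(axis=0)   # index into `actions` per grid point
        return grid, V, [actions[i] for i in greedy]

    # Toy 1-D usage: two actions drift the state right/left, quadratic cost.
    rng = np.random.default_rng(0)
    S = rng.uniform(-1.0, 1.0, size=(200, 1))
    data = {"right": (S, S[:, 0] ** 2, np.clip(S + 0.1, -1, 1)),
            "left":  (S, S[:, 0] ** 2, np.clip(S - 0.1, -1, 1))}
    grid, V, policy = kernel_value_iteration(data, bandwidth=0.2)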
We consider the problem of learning in episodic finite-horizon Markov decision processes with an unknown transition function, bandit feedback, and adversarial losses. We propose an efficient algorithm that achieves $\tilde{\mathcal{O}}(L|X|\sqrt{|A|T})$ regret …
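For orientation (standard notation in this setting, stated here as an assumption about what the symbols denote): $L$ is the episode length, $|X|$ and $|A|$ are the numbers of states and actions, $T$ is the number of episodes, and the bound refers to the regret
$$ R_T \;=\; \sum_{t=1}^{T} \mathbb{E}\!\left[\sum_{k=1}^{L} \ell_t\!\left(x_k^t, a_k^t\right)\right] \;-\; \min_{\pi} \sum_{t=1}^{T} \mathbb{E}^{\pi}\!\left[\sum_{k=1}^{L} \ell_t\!\left(x_k, a_k\right)\right], $$
i.e. the learner's cumulative loss compared with the best fixed policy in hindsight under the adversarially chosen loss functions $\ell_t$.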