Multivariate Utility Optimization with an Application to Risk-Sensitive Partially Observable Markov Decision Processes


Abstract

We introduce and treat a class of Multi-Objective Risk-Sensitive Markov Decision Processes (MORSMDPs), where the optimality criteria are generated by a multivariate utility function applied to a finite set of different running costs. To illustrate our approach, we study the example of a two-armed bandit problem. We then show that standard Risk-Sensitive Partially Observable Markov Decision Processes (RSPOMDPs), where risk is modeled by a utility function that is a sum of exponentials, can be reformulated as MORSMDPs and solved with the methods described in the first part. In this way, we extend the treatment of RSPOMDPs with exponential utility to RSPOMDPs corresponding to a qualitatively larger family of utility functions.
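
As a rough illustration of why a sum-of-exponentials utility fits a multivariate framework, consider the following sketch. The notation ($c$, $w_i$, $\lambda_i$, $C_N$, $\tilde U$) is assumed here for illustration and is not taken from the paper, and the identity ignores partial observability, which the paper's actual reformulation has to handle.

\[
U(x) = \sum_{i=1}^{n} w_i \, e^{\lambda_i x},
\qquad
C_N = \sum_{t=0}^{N-1} c(x_t, a_t).
\]

Defining $n$ running costs $c^{i} = \lambda_i \, c$ with accumulated values $C_N^{i} = \sum_{t=0}^{N-1} c^{i}(x_t, a_t) = \lambda_i C_N$, and the multivariate utility $\tilde U(y_1, \dots, y_n) = \sum_{i=1}^{n} w_i \, e^{y_i}$, one obtains

\[
\mathbb{E}\big[ U(C_N) \big]
= \sum_{i=1}^{n} w_i \, \mathbb{E}\big[ e^{\lambda_i C_N} \big]
= \mathbb{E}\big[ \tilde U\big(C_N^{1}, \dots, C_N^{n}\big) \big].
\]

Under these assumptions, the single-cost criterion with utility $U$ coincides with a multivariate utility criterion over $n$ different running costs, each term being of the exponential type.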
