أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Michel D. Ingham

135 - Mohamadreza Ahmadi , Masahiro Ono , Michel D. Ingham 2019

We consider the problem of designing policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives. Synthesizing risk-averse optimal policies for POMDPs requires infinite memory and thus undecidable. To ov ercome this difficulty, we propose a method based on bounded policy iteration for designing stochastic but finite state (memory) controllers, which takes advantage of standard convex optimization methods. Given a memory budget and optimality criterion, the proposed method modifies the stochastic finite state controller leading to sub-optimal solutions with lower coherent risk.

علم الروبوتات الذكاء الاصطناعي التحسين والتحكم

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد