Maximum likelihood constraint inference is a powerful technique for identifying unmodeled constraints that affect the behavior of a demonstrator acting under a known objective function. However, it was originally formulated only for discrete state-action spaces. Continuous dynamics are more useful for modeling many real-world systems of interest, including the movements of humans and robots. We present a method to generate a tabular state-action space that approximates continuous dynamics and can be used for constraint inference on demonstrations that obey the true system dynamics. We then demonstrate accurate constraint inference on nonlinear pendulum systems with 2- and 4-dimensional state spaces, and show that performance is robust to a range of hyperparameters. The demonstrations are not required to be fully optimal with respect to the objective, and the most likely constraints can be identified even when demonstrations cover only a small portion of the state space. For these reasons, the proposed approach may be especially useful for inferring constraints on human demonstrators, which has important applications in human-robot interaction and biomechanical medicine.
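As an illustration of what a tabular approximation of continuous dynamics can look like, the following is a minimal sketch for a single pendulum with a 2-dimensional state (angle, angular velocity). It is not the construction used in this work, which the abstract does not detail: it assumes a uniform grid, a semi-implicit Euler step, and nearest-cell snapping, and the function name `build_tabular_pendulum` and all parameter values are hypothetical.

```python
import numpy as np


def build_tabular_pendulum(n_theta=20, n_omega=21, n_torque=5, dt=0.05,
                           g=9.81, length=1.0, mass=1.0,
                           omega_max=8.0, torque_max=2.0):
    """Grid the pendulum state-action space and tabulate successor cells.

    Assumed scheme (not taken from the paper): uniform grids, one
    semi-implicit Euler step per action, successor snapped to nearest cell.
    Returns the three grids and an integer table next_state[s, a].
    """
    thetas = np.linspace(-np.pi, np.pi, n_theta, endpoint=False)
    omegas = np.linspace(-omega_max, omega_max, n_omega)
    torques = np.linspace(-torque_max, torque_max, n_torque)

    next_state = np.zeros((n_theta * n_omega, n_torque), dtype=int)
    for i, th in enumerate(thetas):
        for j, om in enumerate(omegas):
            s = i * n_omega + j  # flat index of grid cell (i, j)
            for a, u in enumerate(torques):
                # theta'' = -(g / l) sin(theta) + u / (m l^2)
                accel = -(g / length) * np.sin(th) + u / (mass * length ** 2)
                om_next = np.clip(om + dt * accel, -omega_max, omega_max)
                th_next = th + dt * om_next
                # Circular distance from every grid angle to the new angle.
                d_theta = np.abs((thetas - th_next + np.pi) % (2 * np.pi) - np.pi)
                i_next = int(np.argmin(d_theta))
                j_next = int(np.argmin(np.abs(omegas - om_next)))
                next_state[s, a] = i_next * n_omega + j_next
    return thetas, omegas, torques, next_state


thetas, omegas, torques, next_state = build_tabular_pendulum()
print(next_state.shape)
```

The resulting table defines a deterministic discrete transition model on which a tabular constraint inference method could operate. The grid resolution and time step here are the kind of hyperparameters whose effect would need to be checked; with a coarse grid and small `dt`, many transitions snap back to their own cell.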
The Reward-Biased Maximum Likelihood Estimate (RBMLE) for adaptive control of Markov chains was proposed to overcome the central obstacle of what is variously called the fundamental closed-identifiability problem of adaptive control, the dual control
We consider the problem of reward learning for temporally extended tasks. For reward learning, inverse reinforcement learning (IRL) is a widely used paradigm. Given a Markov decision process (MDP) and a set of demonstrations for a task, IRL learns a
Applying standard statistical methods after model selection may yield inefficient estimators and hypothesis tests that fail to achieve nominal type-I error rates. The main issue is the fact that the post-selection distribution of the data differs fro
We present a framework for bi-level trajectory optimization in which a system's dynamics are encoded as the solution to a constrained optimization problem and smooth gradients of this lower-level problem are passed to an upper-level trajectory optimiz
We derive Laplace-approximated maximum likelihood estimators (GLAMLEs) of parameters in our Graph Generalized Linear Latent Variable Models. Then, we study the statistical properties of GLAMLEs when the number of nodes $n_V$ and the observed times of