ترغب بنشر مسار تعليمي؟ اضغط هنا

Multi-Objective Controller Synthesis with Uncertain Human Preferences

273   0   0.0 ( 0 )
 نشر من قبل Kayla Boggess
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Multi-objective controller synthesis concerns the problem of computing an optimal controller subject to multiple (possibly conflicting) objective properties. The relative importance of objectives is often specified by human decision-makers. However, there is inherent uncertainty in human preferences (e.g., due to different preference elicitation methods). In this paper, we formalize the notion of uncertain human preferences and present a novel approach that accounts for uncertain human preferences in the multi-objective controller synthesis for Markov decision processes (MDPs). Our approach is based on mixed-integer linear programming (MILP) and synthesizes a sound, optimally permissive multi-strategy with respect to a multi-objective property and an uncertain set of human preferences. Experimental results on a range of large case studies show that our MILP-based approach is feasible and scalable to synthesize sound, optimally permissive multi-strategies with varying MDP model sizes and uncertainty levels of human preferences. Evaluation via an online user study also demonstrates the quality and benefits of synthesized (multi-)strategies.

قيم البحث

اقرأ أيضاً

50 - Wenli Ouyang 2019
Service supply chain management is to prepare spare parts for failed products under warranty. Their goal is to reach agreed service level at the minimum cost. We convert this business problem into a preference based multi-objective optimization probl em, where two quality criteria must be simultaneously optimized. One criterion is accuracy of demand forecast and the other is service level. Here we propose a general framework supporting solving preference-based multi-objective optimization problems (MOPs) by multi-gradient descent algorithm (MGDA), which is well suited for training deep neural network. The proposed framework treats agreed service level as a constrained criterion that must be met and generate a Pareto-optimal solution with highest forecasting accuracy. The neural networks used here are two Encoder-Decoder LSTM modes: one is used for pre-training phase to learn distributed representation of former generations service parts consumption data, and the other is used for supervised learning phase to generate forecast quantities of current generations service parts. Evaluated under the service parts consumption data in Lenovo Group Ltd, the proposed method clearly outperform baseline methods.
The problem of allocating scarce items to individuals is an important practical question in market design. An increasingly popular set of mechanisms for this task uses the concept of market equilibrium: individuals report their preferences, have a bu dget of real or fake currency, and a set of prices for items and allocations is computed that sets demand equal to supply. An important real world issue with such mechanisms is that individual valuations are often only imperfectly known. In this paper, we show how concepts from classical market equilibrium can be extended to reflect such uncertainty. We show that in linear, divisible Fisher markets a robust market equilibrium (RME) always exists; this also holds in settings where buyers may retain unspent money. We provide theoretical analysis of the allocative properties of RME in terms of envy and regret. Though RME are hard to compute for general uncertainty sets, we consider some natural and tractable uncertainty sets which lead to well behaved formulations of the problem that can be solved via modern convex programming methods. Finally, we show that very mild uncertainty about valuations can cause RME allocations to outperform those which take estimates as having no underlying uncertainty.
It is incredibly easy for a system designer to misspecify the objective for an autonomous system (robot), thus motivating the desire to have the robot learn the objective from human behavior instead. Recent work has suggested that people have an inte rest in the robot performing well, and will thus behave pedagogically, choosing actions that are informative to the robot. In turn, robots benefit from interpreting the behavior by accounting for this pedagogy. In this work, we focus on misspecification: we argue that robots might not know whether people are being pedagogic or literal and that it is important to ask which assumption is safer to make. We cast objective learning into the more general form of a common-payoff game between the robot and human, and prove that in any such game literal interpretation is more robust to misspecification. Experiments with human data support our theoretical results and point to the sensitivity of the pedagogic assumption.
92 - Yu Xue , Yihang Tang , Xin Xu 2021
Feature selection (FS) is an important research topic in machine learning. Usually, FS is modelled as a+ bi-objective optimization problem whose objectives are: 1) classification accuracy; 2) number of features. One of the main issues in real-world a pplications is missing data. Databases with missing data are likely to be unreliable. Thus, FS performed on a data set missing some data is also unreliable. In order to directly control this issue plaguing the field, we propose in this study a novel modelling of FS: we include reliability as the third objective of the problem. In order to address the modified problem, we propose the application of the non-dominated sorting genetic algorithm-III (NSGA-III). We selected six incomplete data sets from the University of California Irvine (UCI) machine learning repository. We used the mean imputation method to deal with the missing data. In the experiments, k-nearest neighbors (K-NN) is used as the classifier to evaluate the feature subsets. Experimental results show that the proposed three-objective model coupled with NSGA-III efficiently addresses the FS problem for the six data sets included in this study.
Falsification has emerged as an important tool for simulation-based verification of autonomous systems. In this paper, we present extensions to the Scenic scenario specification language and VerifAI toolkit that improve the scalability of sampling-ba sed falsification methods by using parallelism and extend falsification to multi-objective specifications. We first present a parallelized framework that is interfaced with both the simulation and sampling capabilities of Scenic and the falsification capabilities of VerifAI, reducing the execution time bottleneck inherently present in simulation-based testing. We then present an extension of VerifAIs falsification algorithms to support multi-objective optimization during sampling, using the concept of rulebooks to specify a preference ordering over multiple metrics that can be used to guide the counterexample search process. Lastly, we evaluate the benefits of these extensions with a comprehensive set of benchmarks written in the Scenic language.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا