
Situationally Aware Options

Published by: Daniel J Mankowitz
Publication date: 2017
Research field: Informatics Engineering
Paper language: English





Hierarchical abstractions, also known as options, are a type of temporally extended action (Sutton et al. 1999) that enable a reinforcement learning agent to plan at a higher level, abstracting away from lower-level details. In this work, we learn reusable options whose parameters can vary based on the current situation, encouraging different behaviors. In principle, these behaviors can include vigor, defence or even risk-averseness. These are some examples of what we refer to in the broader context as Situational Awareness (SA). We incorporate SA, in the form of vigor, into hierarchical RL by defining and learning situationally aware options in a Probabilistic Goal Semi-Markov Decision Process (PG-SMDP). This is achieved using our Situationally Aware oPtions (SAP) policy gradient algorithm, which comes with a theoretical convergence guarantee. We learn reusable options in different scenarios (i.e., winning/losing) in a RoboCup soccer domain. These options learn to execute with different levels of vigor, resulting in human-like behaviours such as 'time-wasting' in the winning scenario. We show the potential of the agent to exit bad local optima using reusable options in RoboCup. Finally, using SAP, the agent mitigates feature-based model misspecification in a Bottomless Pit of Death domain.
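
As a rough illustration of what "temporally extended action" means in the options framework cited above, the sketch below models an option as an initiation condition, an intra-option policy, and a termination condition, and runs it to completion. It is not the SAP algorithm and it does not model vigor or the PG-SMDP. The names Option, step and run_option, the 1-D chain environment, the per-step cost of 1, and the discount factor of 0.9 are all assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Option:
    """Hypothetical container for the three parts of an option."""
    can_start: Callable[[int], bool]    # initiation condition: may the option begin in this state?
    policy: Callable[[int], int]        # intra-option policy: state -> primitive action
    should_stop: Callable[[int], bool]  # termination condition (deterministic in this sketch)


def step(state: int, action: int) -> Tuple[int, float]:
    """Toy 1-D chain (an assumption): the action is -1 or +1 and every step costs 1."""
    return state + action, -1.0


def run_option(option: Option, state: int, gamma: float = 0.9,
               max_steps: int = 100) -> Tuple[int, float, int]:
    """Execute an option until it terminates.

    Returns the final state, the discounted reward accumulated while the
    option ran, and the number of primitive steps it took.
    """
    if not option.can_start(state):
        raise ValueError("option cannot be initiated in this state")
    total, discount, steps = 0.0, 1.0, 0
    while steps < max_steps:
        state, reward = step(state, option.policy(state))
        total += discount * reward
        discount *= gamma
        steps += 1
        if option.should_stop(state):
            break
    return state, total, steps


# Usage: an option that walks right until it reaches state 3.
walk_right = Option(
    can_start=lambda s: s < 3,
    policy=lambda s: 1,
    should_stop=lambda s: s >= 3,
)
final_state, discounted_return, duration = run_option(walk_right, state=0)
```

From the planner's point of view the whole call to run_option is a single decision that lasts several primitive steps, which is why the higher level can ignore the lower-level details.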




Read also

496 - Mathias Niepert 2013
The Rao-Blackwell theorem is utilized to analyze and improve the scalability of inference in large probabilistic models that exhibit symmetries. A novel marginal density estimator is introduced and shown both analytically and empirically to outperform standard estimators by several orders of magnitude. The developed theory and algorithms apply to a broad class of probabilistic models including statistical relational models considered not susceptible to lifted probabilistic inference.
Most conversational recommendation approaches are either not explainable, require external user knowledge for explaining, or produce explanations that cannot be applied in real time due to computational limitations. In this work, we present a real-time, category-based conversational recommendation approach, which can provide concise explanations without requiring prior user knowledge. We first build an explainable user model in the form of preferences over item categories, and then use the category preferences to recommend items. The user model is built by applying a BERT-based neural architecture to the conversation. Then, we translate the user model into item recommendation scores using a Feed Forward Network. User preferences during the conversation are represented in our approach by category vectors, which are directly interpretable. The experimental results on the real conversational recommendation dataset ReDial demonstrate comparable performance to the state-of-the-art, while our approach is explainable. We also show the potential power of our framework by involving an oracle setting of category preference prediction.
114 - Yibo Hu , Latifur Khan 2021
Deep neural networks have significantly contributed to the success in predictive accuracy for classification tasks. However, they tend to make over-confident predictions in real-world settings, where domain shifting and out-of-distribution (OOD) examples exist. Most research on uncertainty estimation focuses on computer vision because it provides visual validation of uncertainty quality. However, few methods have been presented in the natural language processing domain. Unlike Bayesian methods that indirectly infer uncertainty through weight uncertainties, current evidential uncertainty-based methods explicitly model the uncertainty of class probabilities through subjective opinions. They further consider inherent uncertainty in data with different root causes: vacuity (i.e., uncertainty due to a lack of evidence) and dissonance (i.e., uncertainty due to conflicting evidence). In our paper, we first apply evidential uncertainty in OOD detection for text classification tasks. We propose an inexpensive framework that adopts both auxiliary outliers and pseudo off-manifold samples to train the model with prior knowledge of a certain class, which has high vacuity for OOD samples. Extensive empirical experiments demonstrate that our model based on evidential uncertainty outperforms other counterparts for detecting OOD examples. Our approach can be easily deployed to traditional recurrent neural networks and fine-tuned pre-trained transformers.
137 - Majid Khonji , Jorge Dias , 2019
A significant barrier to deploying autonomous vehicles (AVs) on a massive scale is safety assurance. Several technical challenges arise due to the uncertain environment in which AVs operate, such as road and weather conditions, errors in perception and sensory data, and also model inaccuracy. In this paper, we propose a system architecture for risk-aware AVs capable of reasoning about uncertainty and deliberately bounding the risk of collision below a given threshold. We discuss key challenges in the area, highlight recent research developments, and propose future research directions in three subsystems. First, a perception subsystem that detects objects within a scene while quantifying the uncertainty that arises from different sensing and communication modalities. Second, an intention recognition subsystem that predicts the driving-style and the intention of agent vehicles (and pedestrians). Third, a planning subsystem that takes into account the uncertainty, from perception and intention recognition subsystems, and propagates all the way to control policies that explicitly bound the risk of collision. We believe that such a white-box approach is crucial for future adoption of AVs on a large scale.
Manufacturing Operations Management (MOM) systems are complex in the sense that they integrate data from heterogeneous systems inside the automation pyramid. The need for context-aware analytics arises from the dynamics of these systems that influence data generation and hamper comparability of analytics, especially predictive models (e.g. predictive maintenance), where concept drift affects application of these models in the future. Recently, an increasing amount of research has been directed towards data integration using semantic context models. Manual construction of such context models is an elaborate and error-prone task. Therefore, we pose the challenge to apply combinations of knowledge extraction techniques in the domain of analytics in MOM, which comprises the scope of data integration within Product Life-cycle Management (PLM), Enterprise Resource Planning (ERP), and Manufacturing Execution Systems (MES). We describe motivations, technological challenges and show benefits of context-aware analytics, which leverage from and regard the interconnectedness of semantic context data. Our example scenario shows the need for distribution and effective change tracking of context information.
