Model-Based Safe Policy Search from Signal Temporal Logic Specifications Using Recurrent Neural Networks


Published by: Wenliang Liu

Publication date: 2021

Research field: Electronic Engineering, Informatics Engineering

Paper language: English

Authors: Wenliang Liu, Calin Belta


Abstract (English)

We propose a policy search approach to learn controllers from specifications given as Signal Temporal Logic (STL) formulae. The system model is unknown, and it is learned together with the control policy. The model is implemented as a feedforward neural network (FNN). To capture the history dependency of the STL specification, we use a recurrent neural network (RNN) to implement the control policy. In contrast to prevalent model-free methods, the learning approach proposed here takes advantage of the learned model and is more efficient. We use control barrier functions (CBFs) with the learned model to improve the safety of the system. We validate our algorithm via simulations. The results show that our approach can satisfy the given specification within very few system runs, and therefore it has the potential to be used for on-line control.
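To make the notion of an STL specification concrete, here is a minimal sketch of the standard quantitative (robustness) semantics for two temporal operators over a sampled scalar signal. It is an illustration only, not the authors' implementation: the function names, the predicate form `x > threshold`, and the sample trajectory are assumptions made for this example.

```python
from typing import Sequence


def rho_eventually(signal: Sequence[float], threshold: float, a: int, b: int) -> float:
    """Robustness at time 0 of F_[a,b] (x > threshold).

    The formula holds if the predicate holds at some step in [a, b],
    so the robustness is the best (largest) margin in that window.
    """
    return max(x - threshold for x in signal[a:b + 1])


def rho_always(signal: Sequence[float], threshold: float, a: int, b: int) -> float:
    """Robustness at time 0 of G_[a,b] (x > threshold).

    The formula holds only if the predicate holds at every step in [a, b],
    so the robustness is the worst (smallest) margin in that window.
    """
    return min(x - threshold for x in signal[a:b + 1])


# Hypothetical sampled trajectory (assumption for illustration).
trajectory = [0.0, 0.5, 1.5, 2.0, 1.0]

# A positive value means the formula is satisfied, a negative value means it is violated.
print(rho_eventually(trajectory, 1.0, 0, 4))  # 1.0
print(rho_always(trajectory, 1.0, 2, 3))      # 0.5
print(rho_always(trajectory, 1.0, 0, 4))      # -1.0
```

Because the robustness depends on a whole window of the trajectory rather than on the current state alone, a policy that maximizes it needs some memory of past states, which is the motivation the abstract gives for using an RNN.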


Wenliang Liu, Noushin Mehdipour, Calin Belta, 2020

We propose a framework based on Recurrent Neural Networks (RNNs) to determine an optimal control strategy for a discrete-time system that is required to satisfy specifications given as Signal Temporal Logic (STL) formulae. RNNs can store information about a system over time and thus enable us to determine satisfaction of the dynamic temporal requirements specified in STL formulae. Given an STL formula and a dataset of satisfying system executions and corresponding control policies, we can use RNNs to predict a control policy at each time step based on the current and previous states of the system. We use Control Barrier Functions (CBFs) to guarantee the safety of the predicted control policy. We validate our theoretical formulation and demonstrate its performance in an optimal control problem subject to partially unknown safety constraints through simulations.
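The role of a CBF as a safety filter on a predicted control can be sketched on a toy system. Everything below is an assumption for illustration and is not taken from the paper: a scalar single integrator x_{k+1} = x_k + u_k, a safe set x <= x_max with barrier h(x) = x_max - x, and the discrete-time condition h(x_{k+1}) >= (1 - gamma) * h(x_k). For this system the condition reduces to u <= gamma * (x_max - x), so the filter is a simple clamp.

```python
from typing import List


def cbf_filter(x: float, u_nominal: float, x_max: float = 1.0, gamma: float = 0.5) -> float:
    """Return the control closest to u_nominal that satisfies the CBF condition.

    Assumed dynamics: x_next = x + u. Assumed barrier: h(x) = x_max - x.
    Condition h(x_next) >= (1 - gamma) * h(x) is equivalent to
    u <= gamma * (x_max - x), so larger controls are clamped to that bound.
    """
    u_bound = gamma * (x_max - x)
    return min(u_nominal, u_bound)


def simulate(x0: float, u_nominal: float, steps: int) -> List[float]:
    """Roll out the assumed dynamics with the filter applied at every step."""
    xs = [x0]
    for _ in range(steps):
        u = cbf_filter(xs[-1], u_nominal)
        xs.append(xs[-1] + u)
    return xs


# A constant nominal control (standing in for a learned policy's output)
# would leave the safe set after three steps without the filter.
states = simulate(x0=0.0, u_nominal=0.4, steps=20)
print(max(states) <= 1.0)  # True
```

The first step is left unchanged because the nominal control already satisfies the condition. Later steps are shrunk so that the state approaches the boundary x_max without crossing it.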

Systems and Control, Machine Learning

Weighted Graph-Based Signal Temporal Logic Inference Using Neural Networks

Nasim Baharisangari, Kazuma Hirota, Ruixuan Yan, 2021

Extracting spatial-temporal knowledge from data is useful in many applications. It is important that the obtained knowledge is human-interpretable and amenable to formal analysis. In this paper, we propose a method that trains neural networks to learn spatial-temporal properties in the form of weighted graph-based signal temporal logic (wGSTL) formulas. For learning wGSTL formulas, we introduce a flexible wGSTL formula structure in which the user's preference can be applied in the inferred wGSTL formulas. In the proposed framework, each neuron of the neural networks corresponds to a subformula in a flexible wGSTL formula structure. We initially train a neural network to learn the wGSTL operators and then train a second neural network to learn the parameters in a flexible wGSTL formula structure. We use a COVID-19 dataset and a rain prediction dataset to evaluate the performance of the proposed framework and algorithms. We compare the performance of the proposed framework with three baseline classification methods including K-nearest neighbors, decision trees, and artificial neural networks. The classification accuracy obtained by the proposed framework is comparable with the baseline classification methods.

Artificial Intelligence, Machine Learning

Robust satisfiability check and online control synthesis for uncertain systems under signal temporal logic specifications

Pian Yu, Yulong Gao, Karl H. Johansson, 2021

This paper studies the robust satisfiability check and online control synthesis problems for uncertain discrete-time systems subject to signal temporal logic (STL) specifications. Unlike existing techniques, this work proposes an approach based on STL, reachability analysis, and temporal logic trees. Firstly, a real-time version of STL semantics and a tube-based temporal logic tree are proposed. We show that such a tree can be constructed from every STL formula. Secondly, using the tube-based temporal logic tree, a sufficient condition is obtained for the robust satisfiability check of the uncertain system. When the underlying system is deterministic, a necessary and sufficient condition for satisfiability is obtained. Thirdly, an online control synthesis algorithm is designed. It is shown that when the STL formula is robustly satisfiable and the initial state of the system belongs to the initial root node of the tube-based temporal logic tree, it is guaranteed that the trajectory generated by the controller satisfies the STL formula. The effectiveness of the proposed approach is verified by an automated car overtaking example.

Systems and Control

Co-design of Control and Planning for Multi-rotor UAVs with Signal Temporal Logic Specifications

Yash Vardhan Pant, He Yin, Murat Arcak, 2020

Urban Air Mobility (UAM), or the scenario where multiple manned and Unmanned Aerial Vehicles (UAVs) carry out various tasks over urban airspaces, is a transportation concept of the future that is gaining prominence. UAM missions with complex spatial, temporal and reactive requirements can be succinctly represented using Signal Temporal Logic (STL), a behavioral specification language. However, planning and control of systems with STL specifications is computationally intensive, usually resulting in planning approaches that do not guarantee dynamical feasibility, or control approaches that cannot handle complex STL specifications. Here, we present an approach to co-design the planner and controller such that a given STL specification (possibly over multiple UAVs) is satisfied by trajectories that are dynamically feasible and that our controller can track with a bounded tracking error, which the planner accounts for. The tracking controller is formulated for the non-linear dynamics of the individual UAVs, and the tracking error bound is computed for this controller when the trajectories satisfy some kinematic constraints. We also augment an existing multi-UAV STL-based trajectory generator in order to generate trajectories that satisfy such constraints. We show that this co-design allows for trajectories that satisfy a given STL specification, and are also dynamically feasible in the sense that they can be tracked with bounded error. The applicability of this approach is demonstrated through simulations of multi-UAV missions.

Systems and Control, Multiagent Systems, Robotics

Communication Topology Co-Design in Graph Recurrent Neural Network Based Distributed Control

Fengjun Yang, Nikolai Matni, 2021

When designing large-scale distributed controllers, the information-sharing constraints between sub-controllers, as defined by a communication topology interconnecting them, are as important as the controller itself. Controllers implemented using dense topologies typically outperform those implemented using sparse topologies, but it is also desirable to minimize the cost of controller deployment. Motivated by the above, we introduce a compact but expressive graph recurrent neural network (GRNN) parameterization of distributed controllers that is well suited for distributed controller and communication topology co-design. Our proposed parameterization enjoys a local and distributed architecture, similar to previous Graph Neural Network (GNN)-based parameterizations, while further naturally allowing for joint optimization of the distributed controller and communication topology needed to implement it. We show that the distributed controller/communication topology co-design task can be posed as an $\ell_1$-regularized empirical risk minimization problem that can be efficiently solved using stochastic gradient methods. We run extensive simulations to study the performance of GRNN-based distributed controllers and show that (a) they achieve performance comparable to GNN-based controllers while having fewer free parameters, and (b) our method allows for performance/communication density tradeoff curves to be efficiently approximated.
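The sparsifying effect of an $\ell_1$ penalty on communication weights can be illustrated with a proximal gradient step. Note the substitution: the abstract mentions stochastic gradient methods, while this sketch uses the closely related soft-thresholding (proximal) update because it produces exact zeros. The edge weights, gradients, step size, and penalty below are hypothetical values chosen for illustration, not taken from the paper.

```python
from typing import List


def soft_threshold(w: float, tau: float) -> float:
    """Proximal operator of tau * |w|: shrink w toward zero by tau."""
    if w > tau:
        return w - tau
    if w < -tau:
        return w + tau
    return 0.0


def prox_gradient_step(weights: List[float], grads: List[float],
                       lr: float, lam: float) -> List[float]:
    """One step on loss(w) + lam * ||w||_1.

    A plain gradient step on the smooth loss is followed by
    soft-thresholding with threshold lr * lam.
    """
    return [soft_threshold(w - lr * g, lr * lam) for w, g in zip(weights, grads)]


# Hypothetical edge weights of a communication graph (assumption).
edge_weights = [0.8, 0.05, -0.3]
# Zero loss gradients isolate the effect of the l1 penalty in this example.
loss_grads = [0.0, 0.0, 0.0]

updated = prox_gradient_step(edge_weights, loss_grads, lr=0.1, lam=1.0)
# The small middle weight is set exactly to zero, i.e. that link is dropped.
print(updated)
```

Increasing `lam` zeroes out more edges, which is one way to trace a tradeoff between performance and communication density of the kind the abstract describes.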

Systems and Control, Machine Learning