In this report for the NASA NIAC Phase I study, we present a mission architecture and a robotic platform, the Shapeshifter, that allow multi-domain and redundant mobility on Saturn's moon Titan, and potentially on other bodies with atmospheres. The Shapeshifter is a collection of simple and affordable robotic units, called Cobots, comparable to personal palm-size quadcopters. By attaching to and detaching from each other, multiple Cobots can shape-shift into novel structures capable of (a) rolling on the surface to increase the traverse range, (b) flying in a flight-array formation, and (c) swimming on or under liquid. A ground station complements the robotic platform, hosting science instrumentation and providing power to recharge the Cobots' batteries. The objective of our Phase I study was to provide an initial assessment of the feasibility of the proposed robotic platform architecture, and in particular (a) to characterize the expected science return of a mission to the Sotra-Patera region on Titan; (b) to verify the mechanical and algorithmic feasibility of building a multi-agent platform capable of flying, docking, rolling and un-docking; (c) to evaluate the increased range and efficiency of rolling on Titan relative to flying; and (d) to define a case study of a mission for the exploration of the cryovolcano Sotra-Patera on Titan, whose expected variety of geological features challenges conventional mobility platforms.
In this paper, a novel methodology for feasible motion planning in multi-agent systems is developed. Based on the characteristics of velocity obstacles, chance constraints are formulated in the receding horizon control (RHC) problem, and the geometric information of collision cones is used to generate the feasible regions of velocities for the host agent. With this approach, motion planning is conducted at the velocity level instead of the position level. It therefore yields a safer collision-free trajectory for the multi-agent system, especially for systems with high-speed moving agents. Moreover, a probability threshold on potential collisions can be satisfied during the motion planning process. To validate the effectiveness of the methodology, different scenarios with multiple agents are investigated, and the simulation results show that the proposed approach can effectively avoid potential collisions with a collision probability below a specified threshold.
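To make the collision-cone idea concrete, the following is a minimal, deterministic sketch of a velocity-obstacle test in 2D. It is not the paper's chance-constrained formulation: it assumes disc-shaped agents, exactly known positions and velocities, and constant velocities over the horizon. The function name `in_collision_cone` and its parameters are hypothetical.

```python
def in_collision_cone(p_rel, v_rel, combined_radius):
    """Return True if the relative velocity lies inside the collision cone.

    p_rel: position of the other agent minus position of the host agent.
    v_rel: velocity of the host agent minus velocity of the other agent.
    combined_radius: sum of the two agents' radii (disc-shaped agents assumed).
    """
    px, py = p_rel
    vx, vy = v_rel
    radius_sq = combined_radius ** 2

    # Agents already overlap: treat as a collision regardless of velocity.
    if px * px + py * py <= radius_sq:
        return True

    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        return False  # no relative motion, so the gap never closes

    # The host must be moving toward the other agent.
    if px * vx + py * vy <= 0.0:
        return False

    # Squared perpendicular miss distance is cross^2 / speed_sq.
    cross = px * vy - py * vx
    return cross * cross < radius_sq * speed_sq
```

A planner working at the velocity level would keep only candidate velocities for which this test is false against every neighbor; the chance-constrained version described above would instead bound the probability of the test being true under uncertainty.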
We address the problem of maintaining resource availability in a networked multi-robot team performing distributed tracking of an unknown number of targets in an environment of interest. In our model, robots are equipped with sensing and computational resources that enable them to cooperatively track a set of targets using a distributed Probability Hypothesis Density (PHD) filter. We use the trace of a robot's sensor measurement noise covariance matrix to quantify its sensing quality. If a robot experiences sensor quality degradation while executing the tracking task, the robot team's communication network is reconfigured so that the robot with the faulty sensor can share information with other robots to improve the team's target-tracking ability, without enforcing a large change in the number of active communication links. A central system that monitors the team executes all the network reconfiguration computations. We consider two different PHD fusion methods in this paper and propose four different Mixed Integer Semi-Definite Programming (MISDP) formulations (two for each PHD fusion method) to accomplish our objective. All four MISDP formulations are validated in simulation.
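The sensing-quality measure mentioned above can be illustrated with a short sketch. It computes the trace of each robot's measurement noise covariance matrix and flags robots whose trace exceeds a limit. The threshold-based flagging rule, the function names, and the example matrices are assumptions made for illustration; the abstract does not specify how degradation is detected.

```python
def covariance_trace(cov):
    """Sum of the diagonal entries of a square covariance matrix (list of rows)."""
    return sum(cov[i][i] for i in range(len(cov)))


def degraded_robots(noise_covariances, trace_limit):
    """Return names of robots whose noise-covariance trace exceeds trace_limit.

    noise_covariances: dict mapping robot name -> covariance matrix.
    trace_limit: hypothetical threshold; a larger trace means noisier sensing.
    """
    return sorted(
        name
        for name, cov in noise_covariances.items()
        if covariance_trace(cov) > trace_limit
    )


# Hypothetical example: robot "r2" has a much noisier sensor than the others.
example = {
    "r1": [[0.1, 0.0], [0.0, 0.1]],
    "r2": [[2.0, 0.3], [0.3, 1.5]],
    "r3": [[0.2, 0.0], [0.0, 0.3]],
}
flagged = degraded_robots(example, trace_limit=1.0)
```

In a setup like the one described, the flagged robots would be the ones for which the central system considers adding communication links.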
This paper develops an efficient multi-agent deep reinforcement learning algorithm for cooperative control in power grids. Specifically, we consider the decentralized inverter-based secondary voltage control problem in distributed generators (DGs), which is first formulated as a cooperative multi-agent reinforcement learning (MARL) problem. We then propose a novel on-policy MARL algorithm, PowerNet, in which each agent (DG) learns a control policy based on a (sub-)global reward but only local states from its neighboring agents. Motivated by the fact that a local control action from one agent has limited impact on agents distant from it, we exploit a novel spatial discount factor that reduces the effect of remote agents, which expedites training and improves scalability. Furthermore, a differentiable, learning-based communication protocol is employed to foster collaboration among neighboring agents. In addition, to mitigate the effects of system uncertainty and random noise introduced during on-policy learning, we use an action smoothing factor to stabilize policy execution. To facilitate training and evaluation, we develop PGSim, an efficient, high-fidelity power grid simulation platform. Experimental results in two microgrid setups show that PowerNet outperforms a conventional model-based controller as well as several state-of-the-art MARL algorithms. The decentralized learning scheme and high sample efficiency also make it viable for large-scale power grids.
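One way to picture a spatial discount factor is the sketch below, which weights each agent's reward by `alpha ** d`, where `d` is the hop distance from the learning agent in the communication graph. This particular form, the use of hop distance, and all names are assumptions for illustration; the abstract does not give PowerNet's exact definition.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first search: hop count from source to every reachable agent."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def spatially_discounted_reward(adjacency, rewards, agent, alpha):
    """Sum of rewards weighted by alpha ** (hop distance from agent).

    alpha in [0, 1]: alpha = 1 gives the plain global sum, alpha = 0 keeps
    only the agent's own reward. Unreachable agents contribute nothing.
    """
    dist = hop_distances(adjacency, agent)
    return sum((alpha ** d) * rewards[j] for j, d in dist.items())


# Hypothetical three-agent line graph: 0 - 1 - 2, each with reward 1.0.
line_graph = {0: [1], 1: [0, 2], 2: [1]}
unit_rewards = {0: 1.0, 1: 1.0, 2: 1.0}
reward_for_agent_0 = spatially_discounted_reward(line_graph, unit_rewards, 0, 0.5)
```

With `alpha = 0.5`, agent 0 sees its own reward in full, its neighbor's at half weight, and the two-hop agent's at quarter weight, which matches the intuition that distant agents matter less.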
Topology inference is a crucial problem for cooperative control in multi-agent systems. Unlike most prior work, this paper is dedicated to inferring the directed network topology from observations that consist of a single, noisy, finite time-series system trajectory, where the cooperation dynamics are driven by the initial network state and an unmeasurable latent input. The unmeasurable latent input refers to intrinsic system signals and extrinsic environmental interference. Depending on whether the input is time-invariant or time-varying, we propose a two-layer-optimization-based topology inference algorithm (TO-TIA) and an iterative-estimation-based topology inference algorithm (IE-TIA), respectively. TO-TIA captures the separability of the global agent state and eliminates the unknown influence of the time-invariant input on the system dynamics. IE-TIA further exploits the identifiability and estimability of the more general time-varying input and provides an asymptotic solution with the desired convergence properties, at a higher computational cost than TO-TIA. Our algorithms relax the dependence on observation scale and leverage an empirical risk reformulation to improve inference accuracy in terms of both topology structure and edge weights. Comprehensive theoretical analysis and simulations for various topologies are provided to illustrate the inference feasibility and the performance of the proposed algorithms.
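As a rough illustration of why a time-invariant input can be eliminated, consider the following worked sketch. It assumes linear discrete-time dynamics with interaction matrix $W$, constant latent input $u$, and noise $\omega_t$; these symbols and the model itself are assumptions for illustration, not the formulation used by TO-TIA.

```latex
% Assumed model (hypothetical): linear dynamics with constant latent input u
x_{t+1} = W x_t + u + \omega_t, \qquad t = 0, 1, \dots, T-1

% Subtracting consecutive steps cancels the constant input u
\Delta x_{t+1} = W \, \Delta x_t + (\omega_t - \omega_{t-1}),
\qquad \Delta x_t := x_t - x_{t-1}

% A baseline least-squares estimate of the topology from the differenced data
\hat{W} = \arg\min_{W} \sum_{t=1}^{T-1}
          \left\lVert \Delta x_{t+1} - W \, \Delta x_t \right\rVert_2^2
```

The differenced noise $\omega_t - \omega_{t-1}$ is correlated across time and with the regressor $\Delta x_t$, so this ordinary least-squares estimate should be read only as a baseline; handling that correlation, and the time-varying case, is where more careful estimators are needed.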
Many previous works approach vision-based robotic grasping by training a value network that evaluates grasp proposals. These approaches require an optimization process at run time to infer the best action from the value network. As a result, the inference time grows exponentially as the dimension of the action space increases. We propose an alternative method that directly trains a neural density model to approximate the conditional distribution of successful grasp poses given the input images. We construct a neural network that combines a Gaussian mixture with normalizing flows and is able to represent multi-modal, complex probability distributions. We demonstrate, both in simulation and on a real robot, that the proposed actor model achieves performance similar to that of the value network using the Cross-Entropy Method (CEM) for inference, on top-down grasping with a 4-dimensional action space. Our actor model reduces the inference time by a factor of 3 compared to the state-of-the-art CEM method. We believe that actor models will play an important role when scaling up these approaches to higher-dimensional action spaces.
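The run-time optimization that value-network approaches rely on can be sketched with a basic Cross-Entropy Method loop. This is a generic textbook-style CEM with an independent Gaussian per action dimension, not the implementation evaluated in the paper. The function `toy_value` is a hypothetical stand-in for a learned value network, and all hyperparameters are illustrative assumptions.

```python
import random


def cem_maximize(value_fn, dim, iterations=20, samples=128, elites=12, seed=0):
    """Search for an action that maximizes value_fn using a basic CEM loop."""
    rng = random.Random(seed)
    mean = [0.0] * dim   # initial guess for the best action
    std = [2.0] * dim    # initial search width per dimension
    for _ in range(iterations):
        # Sample candidate actions from the current Gaussian.
        candidates = [
            [rng.gauss(mean[d], std[d]) for d in range(dim)]
            for _ in range(samples)
        ]
        # Keep the highest-scoring candidates (the "elites").
        candidates.sort(key=value_fn, reverse=True)
        best = candidates[:elites]
        # Refit the Gaussian to the elites, with a floor on the width.
        for d in range(dim):
            column = [c[d] for c in best]
            mean[d] = sum(column) / elites
            variance = sum((x - mean[d]) ** 2 for x in column) / elites
            std[d] = max(variance ** 0.5, 1e-3)
    return mean


def toy_value(action):
    """Hypothetical value function with its maximum at (1.0, -2.0)."""
    return -((action[0] - 1.0) ** 2 + (action[1] + 2.0) ** 2)


best_action = cem_maximize(toy_value, dim=2)
```

Each call evaluates the value function `iterations * samples` times, which is the per-decision cost that a directly sampled actor model avoids.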