We consider the optimal coverage problem where a multi-agent network is deployed in an environment with obstacles to maximize a joint event detection probability. The objective function of this problem is non-convex, and the gradient-based algorithms developed to date cannot guarantee a global optimum. We first show that the objective function is monotone submodular, a class of functions for which a simple greedy algorithm is guaranteed to achieve at least a (1 - 1/e) ≈ 0.63 fraction of the optimal value. We then derive two tighter lower bounds by exploiting the curvature information (total curvature and elemental curvature) of the objective function. We further show that the tightness of these lower bounds is complementary with respect to the sensing capabilities of the agents. The greedy solution can subsequently be used as an initial point for a gradient-based algorithm to obtain solutions even closer to the global optimum. Simulation results show that this approach leads to significantly better performance than previously used algorithms.
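To make the greedy step concrete, the sketch below is a minimal toy instance of greedy monotone submodular maximization (Gaussian sensing on an obstacle-free square; all names, constants, and the sensing model are our illustrative choices, not the paper's setup). It places k agents one at a time, each maximizing the marginal gain in the joint detection probability.

    import numpy as np

    # Toy joint-detection coverage: sample event points on a square,
    # Gaussian sensing decay. A minimal sketch of the generic greedy
    # algorithm for monotone submodular maximization, not the paper's
    # obstacle-aware model.
    rng = np.random.default_rng(0)
    points = rng.uniform(0, 10, size=(200, 2))       # event locations
    candidates = rng.uniform(0, 10, size=(50, 2))    # candidate agent positions

    def detect_prob(point, sensor):
        # probability that 'sensor' detects an event at 'point' (assumed model)
        return np.exp(-0.5 * np.sum((point - sensor) ** 2))

    def coverage(S):
        # F(S) = sum over points of P(at least one selected agent detects)
        total = 0.0
        for x in points:
            miss = 1.0
            for i in S:
                miss *= 1.0 - detect_prob(x, candidates[i])
            total += 1.0 - miss
        return total

    def greedy(k):
        S = []
        for _ in range(k):
            # pick the candidate with the largest marginal gain
            best = max((i for i in range(len(candidates)) if i not in S),
                       key=lambda i: coverage(S + [i]) - coverage(S))
            S.append(best)
        return S

    S = greedy(k=5)
    print("greedy value:", coverage(S))  # >= (1 - 1/e) * optimum by submodularity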
The problem of controlling multi-agent systems under different models of information sharing among agents has received significant attention in the recent literature. In this paper, we consider a setup where, rather than committing to a fixed information-sharing protocol (e.g., periodic sharing or no sharing), agents can dynamically decide at each time step whether to share information with each other and incur the resulting communication cost. This setup requires a joint design of the agents' communication and control strategies in order to optimize the trade-off between communication costs and the control objective. We first show that agents can ignore a large part of their private information without compromising system performance. We then provide a common information approach based solution for the strategy optimization problem. This approach relies on constructing a fictitious POMDP whose solution (obtained via a dynamic program) characterizes the optimal strategies for the agents. We also show that our solution can be easily modified to incorporate constraints on when and how frequently agents can communicate.
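The trade-off at the heart of this setup can be illustrated with a toy backward dynamic program (entirely our own construction, not the paper's fictitious POMDP): a controller tracks a two-state Markov chain and, at each step, either pays a cost c to observe the state exactly or acts on its belief alone.

    import numpy as np

    # Toy communicate-vs-act trade-off solved by backward induction over a
    # discretized belief. All numbers are hypothetical.
    P = np.array([[0.9, 0.1],
                  [0.2, 0.8]])            # transition matrix (assumed)
    c, T = 0.15, 20                       # communication cost, horizon
    grid = np.linspace(0.0, 1.0, 201)     # belief b = P(state = 1)

    V = np.zeros_like(grid)               # terminal value
    for t in range(T):
        Vnext = V
        def val(b):
            return np.interp(b, grid, Vnext)
        # no communication: mismatch cost min(b, 1-b), belief moves through P
        b_pred = (1 - grid) * P[0, 1] + grid * P[1, 1]
        V_silent = np.minimum(grid, 1 - grid) + val(b_pred)
        # communicate: pay c, learn the state, control cost becomes 0
        V_comm = c + (1 - grid) * val(P[0, 1]) + grid * val(P[1, 1])
        V = np.minimum(V_silent, V_comm)

    communicate = V_comm < V_silent       # decision rule at the first step
    print("beliefs where communicating is optimal at t=0:",
          grid[communicate].min() if communicate.any() else None,
          "to", grid[communicate].max() if communicate.any() else None)

As expected, communication is chosen only where the belief is most uncertain, i.e., where its control value exceeds its cost.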
We propose a neural network approach for solving high-dimensional optimal control problems. In particular, we focus on multi-agent control problems with obstacle and collision avoidance. These problems immediately become high-dimensional, even for moderate phase-space dimensions per agent. Our approach fuses the Pontryagin Maximum Principle and Hamilton-Jacobi-Bellman (HJB) approaches and parameterizes the value function with a neural network. It yields controls in feedback form, enabling quick calculation and robustness to moderate disturbances to the system. We train our model using the objective function and optimality conditions of the control problem, so our training algorithm involves neither a data generation phase nor solutions from another algorithm. Our model uses empirically effective HJB penalizers for efficient training. By training on a distribution of initial states, we ensure that the controls are optimal on a large portion of the state space. Our approach is grid-free and scales efficiently to dimensions where grids become impractical or infeasible. We demonstrate our approach's effectiveness on a 150-dimensional multi-agent problem with obstacles.
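A minimal single-agent sketch of this training idea follows (our simplification, not the paper's architecture: dynamics x' = u and running cost |u|^2/2, so the PMP feedback form is u = -grad_x V; all hyperparameters are assumptions). It rolls out trajectories under the neural feedback control and penalizes the HJB residual, with no external data or solver.

    import torch

    # Dynamics x' = u, running cost |u|^2/2, terminal cost |x - goal|^2.
    # PMP gives the feedback control u = -grad_x V; the HJB residual is
    # V_t + min_u {|u|^2/2 + grad_x V . u} = V_t - |grad_x V|^2 / 2.
    d, T, steps = 2, 1.0, 20
    goal = torch.tensor([5.0, 5.0])
    dt = T / steps

    V = torch.nn.Sequential(torch.nn.Linear(d + 1, 64), torch.nn.Tanh(),
                            torch.nn.Linear(64, 64), torch.nn.Tanh(),
                            torch.nn.Linear(64, 1))
    opt = torch.optim.Adam(V.parameters(), lr=1e-3)

    def value_grads(t, x):
        t = t.requires_grad_(True)          # t is a fresh leaf each step
        tx = torch.cat([t, x], dim=1)
        g = torch.autograd.grad(V(tx).sum(), tx, create_graph=True)[0]
        return g[:, :1], g[:, 1:]           # V_t, grad_x V

    for it in range(2000):
        x = torch.rand(128, d) * 2.0        # train on a distribution of starts
        cost, hjb_pen = 0.0, 0.0
        for k in range(steps):
            t = torch.full((x.shape[0], 1), k * dt)
            Vt, Vx = value_grads(t, x)
            u = -Vx                         # PMP feedback form
            cost = cost + (0.5 * (u ** 2).sum(dim=1) * dt).mean()
            res = Vt.squeeze(1) - 0.5 * (Vx ** 2).sum(dim=1)
            hjb_pen = hjb_pen + (res ** 2).mean()
            x = x + u * dt                  # explicit Euler rollout
        cost = cost + ((x - goal) ** 2).sum(dim=1).mean()  # terminal cost
        loss = cost + 0.1 * hjb_pen         # HJB penalizer weight: assumed
        opt.zero_grad()
        loss.backward()
        opt.step()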
This paper addresses the tracking of a moving target in a multi-agent network. The target follows linear dynamics corrupted by adversarial noise, i.e., noise that is not generated from a statistical distribution. The location of the target at each time induces a global time-varying loss function, and the global loss is a sum of local losses, each associated with one agent. The agents' noisy observations may be nonlinear. We formulate this problem as distributed online optimization, where agents communicate with each other to track the minimizer of the global loss. We then propose a decentralized version of the Mirror Descent algorithm and provide a non-asymptotic analysis of the problem. Using the notion of dynamic regret, we measure the performance of our algorithm against its offline counterpart in the centralized setting. We prove that the bound on the dynamic regret scales inversely in the network spectral gap and captures the deviation from the linear dynamics caused by the adversarial noise. Our result subsumes a number of results in the distributed optimization literature. Finally, in a numerical experiment, we verify that our algorithm can be easily implemented for multi-agent tracking with nonlinear observations.
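The algorithmic skeleton can be sketched as below (with the Euclidean mirror map, so the mirror step reduces to a plain gradient step; the target model, mixing matrix, and all constants are our illustrative choices). Each agent mixes estimates with its neighbors, descends its local loss, and propagates its estimate through the known linear dynamics.

    import numpy as np

    # Decentralized online mirror descent sketch (Euclidean mirror map).
    # Agents mix estimates with a doubly stochastic matrix W, descend their
    # local loss, then propagate through the known dynamics A. Illustrative
    # constants; the drift term stands in for the adversarial noise.
    rng = np.random.default_rng(1)
    n, d, T, eta = 4, 2, 200, 0.3
    A = np.array([[1.0, 0.1], [0.0, 1.0]])       # known target dynamics
    W = np.full((n, n), 1.0 / n)                 # complete-graph mixing
    target = np.array([1.0, 0.0])
    X = np.zeros((n, d))                         # agents' estimates

    for t in range(T):
        target = A @ target + 0.05 * rng.uniform(-1, 1, d)  # adversarial drift
        # local gradients of f_{i,t}; a toy quadratic loss stands in for the
        # paper's (possibly nonlinear) observation-based losses
        grads = np.array([2 * (x - target) for x in X])
        X = W @ X - eta * grads                  # consensus + mirror step
        X = (A @ X.T).T                          # propagate through dynamics

    print("final tracking errors:", np.linalg.norm(X - target, axis=1))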
In this paper, we investigate a constrained optimal coordination problem for a class of heterogeneous nonlinear multi-agent systems described by high-order dynamics subject to both unknown nonlinearities and external disturbances. Each agent has a private objective function and a constraint on its output. A neural network-based distributed controller is developed for each agent such that all agent outputs reach the constrained minimizer of the aggregate objective function with bounded residual errors. Two examples are finally given to demonstrate the effectiveness of the algorithm.
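The coordination objective itself can be illustrated with a standard distributed projected consensus-gradient scheme, a deliberate stand-in for the paper's neural network-based controller (which additionally handles high-order nonlinear dynamics and disturbances); all numbers below are hypothetical.

    import numpy as np

    # Stand-in illustration, NOT the paper's controller: each agent's output
    # descends its private objective f_i(y) = (y - r_i)^2, averages with
    # neighbors, and is projected onto its output constraint set [lo, hi].
    n, T, eta = 4, 500, 0.05
    r = np.array([1.0, 3.0, -2.0, 0.5])       # private targets (hypothetical)
    lo, hi = -1.0, 2.0                        # per-agent output constraints
    W = np.full((n, n), 1.0 / n)              # doubly stochastic mixing
    y = np.zeros(n)

    for t in range(T):
        grad = 2 * (y - r)                    # local gradients of f_i
        y = W @ y - eta * grad                # consensus + gradient step
        y = np.clip(y, lo, hi)                # projection onto constraints

    # outputs should approach the constrained minimizer of sum_i f_i
    print("outputs:", y, "constrained minimizer:", np.clip(r.mean(), lo, hi))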
We address the problem of multiple local optima commonly arising in optimization problems for multi-agent systems, where objective functions are nonlinear and nonconvex. For the class of coverage control problems, we propose a systematic approach for escaping a local optimum, rather than randomly perturbing controllable variables away from it. We show that the objective function for these problems can be decomposed to facilitate the evaluation of each node's local partial derivative and to provide insight into its structure. This structure is exploited by defining boosting functions applied to the local partial derivative at an equilibrium point, where its value is zero, so as to transform it in a way that induces nodes to explore poorly covered areas of the mission space until a new equilibrium point is reached. The proposed boosting process ensures that, at its conclusion, the objective function is no worse than its pre-boosting value, although a global optimum still cannot be guaranteed. We define three families of boosting functions with different properties and provide simulation results illustrating how this approach improves the solutions obtained for this class of distributed optimization problems.
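A minimal sketch of the boost-and-descend loop follows (toy one-dimensional coverage objective and a single illustrative boosting function of our own choosing, not one of the paper's three families): ascend to an equilibrium, transform the vanished gradient to push nodes toward the most poorly covered point, re-converge, and keep the new configuration only if the objective did not decrease.

    import numpy as np

    rng = np.random.default_rng(2)
    pts = np.linspace(0, 10, 200)                 # mission-space sample points

    def coverage(x):
        # joint detection: P(at least one node detects each point), Gaussian decay
        p = np.exp(-0.5 * (pts[None, :] - x[:, None]) ** 2)
        return np.sum(1 - np.prod(1 - p, axis=0))

    def grad(x, boost=None):
        # numeric local partial derivatives, optionally boosted
        g, eps = np.zeros_like(x), 1e-5
        for i in range(len(x)):
            e = np.zeros_like(x)
            e[i] = eps
            g[i] = (coverage(x + e) - coverage(x - e)) / (2 * eps)
        return g if boost is None else boost(x, g)

    def ascend(x, boost=None, iters=500, eta=0.05):
        for _ in range(iters):
            x = x + eta * grad(x, boost)
        return x

    def boost(x, g):
        # illustrative boosting: where the derivative is near zero, pull the
        # node toward the most poorly covered sample point instead
        p = np.exp(-0.5 * (pts[None, :] - x[:, None]) ** 2)
        worst = pts[np.argmin(1 - np.prod(1 - p, axis=0))]
        return np.where(np.abs(g) < 1e-3, np.sign(worst - x), g)

    x0 = ascend(rng.uniform(0, 2, size=3))        # converge to a local optimum
    x1 = ascend(ascend(x0, boost))                # boost, then re-converge
    best = x1 if coverage(x1) >= coverage(x0) else x0  # never accept a worse value
    print(coverage(x0), "->", coverage(best))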