This paper proposes a data-driven control framework to regulate an unknown, stochastic linear dynamical system to the solution of a (stochastic) convex optimization problem. Despite the centrality of this problem, most available methods rely critically on precise knowledge of the system dynamics, and thus require off-line system identification and model refinement. To remove this requirement, we first show that the steady-state transfer function of a linear system can be computed directly from control experiments, bypassing explicit model identification. We then leverage the estimated transfer function to design a controller, inspired by stochastic gradient descent methods, that regulates the system to the solution of the prescribed optimization problem. A distinguishing feature of our methods is that they do not require any knowledge of the system dynamics, disturbance terms, or their distributions. Our technical analysis combines concepts and tools from behavioral system theory, stochastic optimization with decision-dependent distributions, and stability analysis. We illustrate the applicability of the framework on a case study for mobility-on-demand ride service scheduling in Manhattan, NY.
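As a rough illustration of the two steps described in this abstract (estimate the steady-state map from experiments, then run a gradient-type feedback controller), the following Python sketch uses a toy, noise-free plant. It is not the paper's method: the steady-state gain is estimated here by simple unit-step experiments rather than by the behavioral-theory construction, and the plant matrices, the quadratic cost, the step size `eta`, the weight `rho`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy plant x+ = A x + B u, y = C x (stable, noise-free).
A = np.diag([0.5, 0.3])
B = np.eye(2)
C = np.array([[1.0, 1.0], [0.0, 1.0]])

def steady_state_output(u, settle_steps=200):
    """Apply a constant input from rest and return the settled output."""
    x = np.zeros(A.shape[0])
    for _ in range(settle_steps):
        x = A @ x + B @ u
    return C @ x

def estimate_gain(n_inputs):
    """Estimate the steady-state gain column by column from unit-step experiments."""
    y0 = steady_state_output(np.zeros(n_inputs))
    cols = [steady_state_output(e) - y0 for e in np.eye(n_inputs)]
    return np.column_stack(cols)

def gradient_controller(G_hat, y_ref, rho=0.1, eta=0.01, steps=5000):
    """Steer the plant toward argmin 0.5*||y - y_ref||^2 + 0.5*rho*||u||^2,
    using measured outputs in place of a model-based prediction of y."""
    x = np.zeros(A.shape[0])
    u = np.zeros(G_hat.shape[1])
    for _ in range(steps):
        y = C @ x                                # output measurement
        grad = rho * u + G_hat.T @ (y - y_ref)   # gradient evaluated with feedback
        u = u - eta * grad                       # gradient-descent input update
        x = A @ x + B @ u                        # plant evolves under the new input
    return u

G_hat = estimate_gain(2)
y_ref = np.array([1.0, 0.5])
u_final = gradient_controller(G_hat, y_ref)
```

The small step size keeps the controller slower than the plant, which is what lets the measured output stand in for the steady-state response in this toy setting.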
This paper proposes a data-driven framework to solve time-varying optimization problems associated with unknown linear dynamical systems. Making online control decisions to regulate a dynamical system to the solution of an optimization problem is a central goal in many modern engineering applications. Yet the available methods rely critically on precise knowledge of the system dynamics, thus mandating a preliminary system identification phase before a controller can be designed. In this work, we leverage results from behavioral theory to show that the steady-state transfer function of a linear system can be computed from data samples without any knowledge or estimation of the system model. We then use this data-driven representation to design a controller, inspired by a gradient-descent optimization method, that regulates the system to the solution of a convex optimization problem without requiring any knowledge of the time-varying disturbances affecting the model equation. The results are tailored to cost functions that satisfy the Polyak-Łojasiewicz inequality.
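For reference, the Polyak-Łojasiewicz (PL) condition is usually stated as follows for a differentiable cost; the symbols $f$, $f^\star$, and $\mu$ are notation introduced here and do not come from the abstract.

```latex
\frac{1}{2}\,\bigl\|\nabla f(x)\bigr\|^{2} \;\ge\; \mu\,\bigl(f(x) - f^{\star}\bigr)
\quad \text{for all } x, \qquad \mu > 0, \quad f^{\star} = \min_{x} f(x).
```

Every $\mu$-strongly convex function satisfies this inequality, but a PL function need not be convex, so the condition is weaker than strong convexity.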
We consider optimization problems for (networked) systems, where we minimize a cost that includes a known time-varying function associated with the system outputs and an unknown function of the inputs. We focus on a data-based online projected gradient algorithm where: i) the input-output map of the system is replaced by measurements of the output whenever available (thus leading to a closed-loop setup); and ii) the unknown function is learned based on functional evaluations that may occur infrequently. Accordingly, the feedback-based online algorithm operates in a regime with inexact gradient knowledge and with random updates. We show that the online algorithm generates points that are within a bounded error from the optimal solution of the problem; in particular, we provide error bounds in expectation and with high probability, where the latter is given when the gradient error follows a sub-Weibull distribution and when missing measurements are modeled as Bernoulli random variables. We also provide results in terms of input-to-state stability in expectation and in probability. Numerical results are presented in the context of a demand response task in power systems.
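The regime described here, with inexact gradients and updates that happen only when a measurement arrives, can be sketched in a few lines of Python. This is a minimal toy under stated assumptions, not the paper's algorithm: the cost is a fixed quadratic rather than a time-varying one, the gradient error is Gaussian, and the box constraint, `p_update`, `eta`, and `noise_std` are illustrative choices.

```python
import numpy as np

def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n."""
    return np.clip(x, lo, hi)

def online_projected_gradient(b, p_update=0.7, eta=0.2, noise_std=0.01,
                              steps=500, seed=0):
    """Minimize 0.5*||x - b||^2 over the box with noisy gradients and
    Bernoulli(p_update) updates; the iterate is held when no update occurs."""
    rng = np.random.default_rng(seed)
    x = np.zeros_like(b)
    for _ in range(steps):
        if rng.random() < p_update:          # a measurement arrived
            grad = (x - b) + rng.normal(scale=noise_std, size=b.shape)
            x = project_box(x - eta * grad)  # inexact projected gradient step
        # otherwise the measurement is missing and x is left unchanged
    return x

b = np.array([0.2, 0.6, 1.4])                # last entry lies outside the box
x_final = online_projected_gradient(b)
x_opt = project_box(b)                       # constrained optimum of this toy cost
```

Because the gradient is never exact, the iterate settles in a neighborhood of `x_opt` rather than at it, which mirrors the bounded-error flavor of the guarantees mentioned in the abstract.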
Preference-based global optimization algorithms minimize an unknown objective function only based on whether the function is better, worse, or similar for given pairs of candidate optimization vectors. Such optimization problems arise in many real-life examples, such as finding the optimal calibration of the parameters of a control law. The calibrator can judge whether a particular combination of parameters leads to a better, worse, or similar closed-loop performance. Often, the search for the optimal parameters is also subject to unknown constraints. For example, the vector of calibration parameters must not lead to closed-loop instability. This paper extends an active preference learning algorithm introduced recently by the authors to handle unknown constraints. The proposed method, called C-GLISp, looks for an optimizer of the problem only based on preferences expressed on pairs of candidate vectors, and on whether a given vector is reported feasible and/or satisfactory. C-GLISp learns a surrogate of the underlying objective function based on the expressed preferences, and a surrogate of the probability that a sample is feasible and/or satisfactory based on whether each of the tested vectors was judged as such. The surrogate functions are used to propose a new candidate vector for testing and assessment iteratively. Numerical benchmarks and a semi-automated control calibration task demonstrate the effectiveness of C-GLISp, showing that it can reach near-optimal solutions within a small number of iterations.
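To make the information model concrete, the sketch below shows an optimizer that only ever sees a three-valued preference and a feasibility flag. It deliberately swaps the surrogate-based search of C-GLISp for plain random sampling, so it illustrates the query interface only; the hidden cost, the hidden constraint, the tolerance `tol`, and all names are hypothetical.

```python
import numpy as np

def preference(f, x1, x2, tol=1e-3):
    """Oracle standing in for the human judge:
    -1 if x1 is better than x2, 1 if worse, 0 if similar."""
    d = f(x1) - f(x2)
    if d < -tol:
        return -1
    if d > tol:
        return 1
    return 0

def preference_random_search(f, feasible, n_samples=200, seed=0):
    """Keep the best feasible candidate using pairwise preferences only."""
    rng = np.random.default_rng(seed)
    best = None
    for x in rng.uniform(0.0, 1.0, size=n_samples):
        if not feasible(x):                  # reported infeasible: discard
            continue
        if best is None or preference(f, x, best) == -1:
            best = x                         # candidate preferred to incumbent
    return best

hidden_cost = lambda x: (x - 0.3) ** 2       # never evaluated by the search itself
hidden_feasible = lambda x: x <= 0.8         # unknown constraint, seen only as a flag
best = preference_random_search(hidden_cost, hidden_feasible)
```

A surrogate-based method aims to reach a comparable incumbent with far fewer comparisons, which matters when each comparison requires a closed-loop experiment.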
Stochastic model predictive control (SMPC) has been a promising solution to complex control problems under uncertain disturbances. However, traditional SMPC approaches either require exact knowledge of probabilistic distributions or rely on a large number of scenarios generated to represent uncertainties. In this paper, a novel scenario-based SMPC approach is proposed by actively learning a data-driven uncertainty set from available data with machine learning techniques. A systematic procedure is then proposed to further calibrate the uncertainty set, which provides an appropriate probabilistic guarantee. The resulting data-driven uncertainty set is more compact than traditional norm-based sets and can help reduce the conservatism of control actions. Moreover, the proposed method requires fewer data samples than traditional scenario-based SMPC approaches, thereby enhancing the practicality of SMPC. Finally, the optimal control problem is cast as a single-stage robust optimization problem, which can be solved efficiently by deriving the robust counterpart problem. Feasibility and stability issues are also discussed in detail. The efficacy of the proposed approach is demonstrated through a two-mass-spring system and a building energy control problem under uncertain disturbances.
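The calibration step can be illustrated with a much simpler set than the learned one described in the abstract. The sketch below assumes a Euclidean ball around the training mean and picks the smallest radius that covers a `1 - epsilon` fraction of a held-out calibration sample; the ball shape, the synthetic Gaussian data, and the function name `calibrate_radius` are assumptions made for illustration.

```python
import math
import numpy as np

def calibrate_radius(samples, center, epsilon):
    """Smallest radius whose ball around `center` covers at least a
    (1 - epsilon) fraction of the calibration samples."""
    dists = np.sort(np.linalg.norm(samples - center, axis=1))
    k = math.ceil((1.0 - epsilon) * len(dists))   # number of points to cover
    return dists[max(k, 1) - 1]

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 2))    # synthetic data used to shape the set
calib = rng.normal(size=(200, 2))    # held-out data used only for calibration
center = train.mean(axis=0)
radius = calibrate_radius(calib, center, epsilon=0.1)

# Empirical coverage on the calibration sample (at least 1 - epsilon by construction).
coverage = np.mean(np.linalg.norm(calib - center, axis=1) <= radius)
```

This only shows empirical coverage on the calibration sample; an out-of-sample probabilistic guarantee requires the kind of analysis the abstract refers to.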
We study safe, data-driven control of (Markov) jump linear systems with unknown transition probabilities, where both the discrete mode and the continuous state are to be inferred from output measurements. To this end, we develop a receding horizon estimator which uniquely identifies a sub-sequence of past mode transitions and the corresponding continuous state, allowing for arbitrary switching behavior. Unlike traditional approaches to mode estimation, we do not require an offline exhaustive search over mode sequences to determine the size of the observation window, but rather select it online. If the system is weakly mode observable, the window size will be upper bounded, leading to a finite-memory observer. We integrate the estimation procedure with a simple distributionally robust controller, which hedges against misestimations of the transition probabilities due to finite sample sizes. As additional mode transitions are observed, the ambiguity sets are updated, resulting in continual improvements of the control performance. The practical applicability of the approach is illustrated on small numerical examples.
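One common way to build an ambiguity set that tightens with data is an L1 ball around the empirical transition probabilities. The sketch below assumes that construction, with a radius taken from the concentration bound of Weissman et al. for empirical distributions; the abstract does not say which ambiguity set the authors use, so the set shape, the confidence level `delta`, and the function names are all assumptions.

```python
import math
import numpy as np

def update_counts(counts, mode_from, mode_to):
    """Record one observed mode transition in the count matrix."""
    counts[mode_from, mode_to] += 1
    return counts

def ambiguity_set(counts, mode, delta=0.05):
    """Empirical transition row for `mode` and an L1 radius that holds with
    probability at least 1 - delta (assumed bound; needs at least 2 modes)."""
    n = counts[mode].sum()
    d = counts.shape[1]
    if n == 0:
        return np.full(d, 1.0 / d), 2.0      # no data yet: the whole simplex
    p_hat = counts[mode] / n
    radius = math.sqrt(2.0 / n * math.log((2 ** d - 2) / delta))
    return p_hat, min(radius, 2.0)           # L1 distance on the simplex is at most 2

counts = np.zeros((2, 2))
modes = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]       # hypothetical estimated mode sequence
for a, b in zip(modes[:-1], modes[1:]):
    update_counts(counts, a, b)
p_hat, radius = ambiguity_set(counts, mode=0)
```

The radius scales as one over the square root of the number of observed transitions out of a mode, so a robust controller built on this set becomes less conservative as data accumulates.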