No Arabic abstract
We propose automated techniques for the verification and control of probabilistic real-time systems that are only partially observable. To formally model such systems, we define an extension of probabilistic timed automata in which local states are partially visible to an observer or controller. We give a probabilistic temporal logic that can express a range of quantitative properties of these models, relating to the probability of an events occurrence or the expected value of a reward measure. We then propose techniques to either verify that such a property holds or to synthesise a controller for the model which makes it true. Our approach is based on an integer discretisation of the models dense-time behaviour and a grid-based abstraction of the uncountable belief space induced by partial observability. The latter is necessarily approximate since the underlying problem is undecidable, however we show how both lower and upper bounds on numerical results can be generated. We illustrate the effectiveness of the approach by implementing it in the PRISM model checker and applying it to several case studies, from the domains of computer security and task scheduling.
We study safe, data-driven control of (Markov) jump linear systems with unknown transition probabilities, where both the discrete mode and the continuous state are to be inferred from output measurements. To this end, we develop a receding horizon estimator which uniquely identifies a sub-sequence of past mode transitions and the corresponding continuous state, allowing for arbitrary switching behavior. Unlike traditional approaches to mode estimation, we do not require an offline exhaustive search over mode sequences to determine the size of the observation window, but rather select it online. If the system is weakly mode observable, the window size will be upper bounded, leading to a finite-memory observer. We integrate the estimation procedure with a simple distributionally robust controller, which hedges against misestimations of the transition probabilities due to finite sample sizes. As additional mode transitions are observed, the used ambiguity sets are updated, resulting in continual improvements of the control performance. The practical applicability of the approach is illustrated on small numerical examples.
We present a faster symbolic algorithm for the following central problem in probabilistic verification: Compute the maximal end-component (MEC) decomposition of Markov decision processes (MDPs). This problem generalizes the SCC decomposition problem of graphs and closed recurrent sets of Markov chains. The model of symbolic algorithms is widely used in formal verification and model-checking, where access to the input model is restricted to only symbolic operations (e.g., basic set operations and computation of one-step neighborhood). For an input MDP with $n$ vertices and $m$ edges, the classical symbolic algorithm from the 1990s for the MEC decomposition requires $O(n^2)$ symbolic operations and $O(1)$ symbolic space. The only other symbolic algorithm for the MEC decomposition requires $O(n sqrt{m})$ symbolic operations and $O(sqrt{m})$ symbolic space. A main open question is whether the worst-case $O(n^2)$ bound for symbolic operations can be beaten. We present a symbolic algorithm that requires $widetilde{O}(n^{1.5})$ symbolic operations and $widetilde{O}(sqrt{n})$ symbolic space. Moreover, the parametrization of our algorithm provides a trade-off between symbolic operations and symbolic space: for all $0<epsilon leq 1/2$ the symbolic algorithm requires $widetilde{O}(n^{2-epsilon})$ symbolic operations and $widetilde{O}(n^{epsilon})$ symbolic space ($widetilde{O}$ hides poly-logarithmic factors). Using our techniques we present faster algorithms for computing the almost-sure winning regions of $omega$-regular objectives for MDPs. We consider the canonical parity objectives for $omega$-regular objectives, and for parity objectives with $d$-priorities we present an algorithm that computes the almost-sure winning region with $widetilde{O}(n^{2-epsilon})$ symbolic operations and $widetilde{O}(n^{epsilon})$ symbolic space, for all $0 < epsilon leq 1/2$.
Time-Sensitive Distributed Systems (TSDS), such as applications using autonomous drones, achieve goals under possible environment interference (eg, winds). Moreover, goals are often specified using explicit time constraints which must be satisfied by the system emph{perpetually}. For example, drones carrying out the surveillance of some area must always have emph{recent pictures}, ie, at most $M$ time units old, of some strategic locations. This paper proposes a Multiset Rewriting language with explicit time for specifying and analysing TSDSes. We introduce two properties, emph{realizability} (some trace is good) and emph{survivability} (where, in addition, all admissible traces are good). A good trace is an infinite trace in which goals are perpetually satisfied. We propose a class of systems called emph{progressive timed systems} (PTS), where intuitively only a finite number of actions can be carried out in a bounded time period. We prove that for this class of systems both the realizability and the survivability problems are PSPACE-complete. Furthermore, if we impose a bound on time (as in bounded model-checking), we show that for PTS, realizability becomes NP-complete, while survivability is in the $Delta_2^p$ class of the polynomial hierarchy. Finally, we demonstrate that the rewriting logic system Maude can be used to automate time bounded verification of PTS.
Safety-critical distributed cyber-physical systems (CPSs) have been found in a wide range of applications. Notably, they have displayed a great deal of utility in intelligent transportation, where autonomous vehicles communicate and cooperate with each other via a high-speed communication network. Such systems require an ability to identify maneuvers in real-time that cause dangerous circumstances and ensure the implementation always meets safety-critical requirements. In this paper, we propose a real-time decentralized reachability approach for safety verification of a distributed multi-agent CPS with the underlying assumption that all agents are time-synchronized with a low degree of error. In the proposed approach, each agent periodically computes its local reachable set and exchanges this reachable set with the other agents with the goal of verifying the system safety. Our method, implemented in Java, takes advantages of the timing information and the reachable set information that are available in the exchanged messages to reason about the safety of the whole system in a decentralized manner. Any particular agent can also perform local safety verification tasks based on their local clocks by analyzing the messages it receives. We applied the proposed method to verify, in real-time, the safety properties of a group of quadcopters performing a distributed search mission.
The balance between exploration and exploitation is a key problem for reinforcement learning methods, especially for Q-learning. In this paper, a fidelity-based probabilistic Q-learning (FPQL) approach is presented to naturally solve this problem and applied for learning control of quantum systems. In this approach, fidelity is adopted to help direct the learning process and the probability of each action to be selected at a certain state is updated iteratively along with the learning process, which leads to a natural exploration strategy instead of a pointed one with configured parameters. A probabilistic Q-learning (PQL) algorithm is first presented to demonstrate the basic idea of probabilistic action selection. Then the FPQL algorithm is presented for learning control of quantum systems. Two examples (a spin- 1/2 system and a lamda-type atomic system) are demonstrated to test the performance of the FPQL algorithm. The results show that FPQL algorithms attain a better balance between exploration and exploitation, and can also avoid local optimal policies and accelerate the learning process.