ترغب بنشر مسار تعليمي؟ اضغط هنا

Partially Observable Online Change Detection via Smooth-Sparse Decomposition

106   0   0.0 ( 0 )
 نشر من قبل Chen Zhang
 تاريخ النشر 2020
والبحث باللغة English




اسأل ChatGPT حول البحث

We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities. On the one hand, the detection scheme should be able to deal with partially observable data and meanwhile have efficient detection power for sparse changes. On the other, the scheme should be able to adaptively and actively select the most important variables to observe to maximize the detection power. To address these two points, in this paper, we propose a novel detection scheme called CDSSD. In particular, it describes the structure of high dimensional data with sparse changes by smooth-sparse decomposition, whose parameters can be learned via spike-slab variational Bayesian inference. Then the posterior Bayes factor, which incorporates the learned parameters and sparse change information, is formulated as a detection statistic. Finally, by formulating the statistic as the reward of a combinatorial multi-armed bandit problem, an adaptive sampling strategy based on Thompson sampling is proposed. The efficacy and applicability of our method in practice are demonstrated with numerical studies and a real case study.



قيم البحث

اقرأ أيضاً

Finding sparse solutions of underdetermined systems of linear equations is a fundamental problem in signal processing and statistics which has become a subject of interest in recent years. In general, these systems have infinitely many solutions. How ever, it may be shown that sufficiently sparse solutions may be identified uniquely. In other words, the corresponding linear transformation will be invertible if we restrict its domain to sufficiently sparse vectors. This property may be used, for example, to solve the underdetermined Blind Source Separation (BSS) problem, or to find sparse representation of a signal in an `overcomplete dictionary of primitive elements (i.e., the so-called atomic decomposition). The main drawback of current methods of finding sparse solutions is their computational complexity. In this paper, we will show that by detecting `active components of the (potential) solution, i.e., those components having a considerable value, a framework for fast solution of the problem may be devised. The idea leads to a family of algorithms, called `Iterative Detection-Estimation (IDE), which converge to the solution by successive detection and estimation of its active part. Comparing the performance of IDE(s) with one of the most successful method to date, which is based on Linear Programming (LP), an improvement in speed of about two to three orders of magnitude is observed.
Solving Partially Observable Markov Decision Processes (POMDPs) is hard. Learning optimal controllers for POMDPs when the model is unknown is harder. Online learning of optimal controllers for unknown POMDPs, which requires efficient learning using r egret-minimizing algorithms that effectively tradeoff exploration and exploitation, is even harder, and no solution exists currently. In this paper, we consider infinite-horizon average-cost POMDPs with unknown transition model, though a known observation model. We propose a natural posterior sampling-based reinforcement learning algorithm (PSRL-POMDP) and show that it achieves a regret bound of $O(log T)$, where $T$ is the time horizon, when the parameter set is finite. In the general case (continuous parameter set), we show that the algorithm achieves $O (T^{2/3})$ regret under two technical assumptions. To the best of our knowledge, this is the first online RL algorithm for POMDPs and has sub-linear regret.
Information gathering in a partially observable environment can be formulated as a reinforcement learning (RL), problem where the reward depends on the agents uncertainty. For example, the reward can be the negative entropy of the agents belief over an unknown (or hidden) variable. Typically, the rewards of an RL agent are defined as a function of the state-action pairs and not as a function of the belief of the agent; this hinders the direct application of deep RL methods for such tasks. This paper tackles the challenge of using belief-based rewards for a deep RL agent, by offering a simple insight that maximizing any convex function of the belief of the agent can be approximated by instead maximizing a prediction reward: a reward based on prediction accuracy. In particular, we derive the exact error between negative entropy and the expected prediction reward. This insight provides theoretical motivation for several fields using prediction rewards---namely visual attention, question answering systems, and intrinsic motivation---and highlights their connection to the usually distinct fields of active perception, active sensing, and sensor placement. Based on this insight we present deep anticipatory networks (DANs), which enables an agent to take actions to reduce its uncertainty without performing explicit belief inference. We present two applications of DANs: building a sensor selection system for tracking people in a shopping mall and learning discrete models of attention on fashion MNIST and MNIST digit classification.
We study the problem of learning a Nash equilibrium (NE) in an imperfect information game (IIG) through self-play. Precisely, we focus on two-player, zero-sum, episodic, tabular IIG under the perfect-recall assumption where the only feedback is reali zations of the game (bandit feedback). In particular, the dynamic of the IIG is not known -- we can only access it by sampling or interacting with a game simulator. For this learning setting, we provide the Implicit Exploration Online Mirror Descent (IXOMD) algorithm. It is a model-free algorithm with a high-probability bound on the convergence rate to the NE of order $1/sqrt{T}$ where $T$ is the number of played games. Moreover, IXOMD is computationally efficient as it needs to perform the updates only along the sampled trajectory.
Since the early 1900s, numerous research efforts have been devoted to developing quantitative solutions to stochastic mechanical systems. In general, the problem is perceived as solved when a complete or partial probabilistic description on the quant ity of interest (QoI) is determined. However, in the presence of complex system behavior, there is a critical need to go beyond mere probabilistic descriptions. In fact, to gain a full understanding of the system, it is crucial to extract physical characterizations from the probabilistic structure of the QoI, especially when the QoI solution is obtained in a data-driven fashion. Motivated by this perspective, the paper proposes a framework to obtain structuralized characterizations on behaviors of stochastic systems. The framework is named Probabilistic Performance-Pattern Decomposition (PPPD). PPPD analysis aims to decompose complex response behaviors, conditional to a prescribed performance state, into meaningful patterns in the space of system responses, and to investigate how the patterns are triggered in the space of basic random variables. To illustrate the application of PPPD, the paper studies three numerical examples: 1) an illustrative example with hypothetical stochastic processes input and output; 2) a stochastic Lorenz system with periodic as well as chaotic behaviors; and 3) a simplified shear-building model subjected to a stochastic ground motion excitation.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا