The problem of quickest detection of a change in the mean of a sequence of independent observations is studied. The pre-change distribution is assumed to be stationary, while the post-change distributions are allowed to be non-stationary. The case where the pre-change distribution is known is studied first, followed by the extension where only the mean and variance of the pre-change distribution are known. No knowledge of the post-change distributions is assumed other than that their means are above some pre-specified threshold larger than the pre-change mean. For the case where the pre-change distribution is known, a test is derived that asymptotically minimizes the worst-case detection delay over all possible post-change distributions, as the false alarm rate goes to zero. Towards deriving this asymptotically optimal test, some new results are provided for the general problem of asymptotic minimax robust quickest change detection in non-stationary settings. The limiting form of the optimal test as the gap between the pre- and post-change means goes to zero, called the Mean-Change Test (MCT), is then studied. It is shown that the MCT can be designed with only knowledge of the mean and variance of the pre-change distribution. The performance of the MCT is also characterized when the mean gap is moderate, under the additional assumption that the distributions of the observations have bounded support. The analysis is validated through numerical results for detecting a change in the mean of a beta distribution. The use of the MCT in monitoring pandemics is also demonstrated.
We study the problem of quickest detection of a change in the mean of an observation sequence, under the assumption that both the pre- and post-change distributions have bounded support. We first study the case where the pre-change distribution is known, and then study the extension where only the mean and variance of the pre-change distribution are known. In both cases, no knowledge of the post-change distribution is assumed other than that it has bounded support. For the case where the pre-change distribution is known, we derive a test that asymptotically minimizes the worst-case detection delay over all post-change distributions, as the false alarm rate goes to zero. We then study the limiting form of the optimal test as the gap between the pre- and post-change means goes to zero, which we call the Mean-Change Test (MCT). We show that the MCT can be designed with only knowledge of the mean and variance of the pre-change distribution. We validate our analysis through numerical results for detecting a change in the mean of a beta distribution. We also demonstrate the use of the MCT for pandemic monitoring.
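For readers unfamiliar with quickest change detection, the following is a minimal sketch of a generic CUSUM-style detector for an upward shift in the mean. It is not the Mean-Change Test derived in the paper: the Gaussian-form increment, the function name `cusum_mean_shift`, and the parameters `eta` (assumed lower bound on the post-change mean) and `threshold` are assumptions made here for illustration. The sketch only reflects the idea that such a detector can be run with the pre-change mean `mu0` and variance `sigma2` as its sole distributional inputs.

```python
def cusum_mean_shift(samples, mu0, sigma2, eta, threshold):
    """Illustrative CUSUM-style detector for an upward mean shift.

    mu0, sigma2: pre-change mean and variance (assumed known).
    eta: assumed lower bound on the post-change mean (eta > mu0).
    threshold: alarm level; larger values give fewer false alarms
               at the cost of longer detection delay.
    Returns the 1-based index of the first alarm, or None.
    """
    if eta <= mu0:
        raise ValueError("eta must exceed the pre-change mean mu0")
    if sigma2 <= 0:
        raise ValueError("sigma2 must be positive")

    gap = eta - mu0
    stat = 0.0
    for n, x in enumerate(samples, start=1):
        # Gaussian-form increment (an assumption of this sketch):
        # positive when x is closer to eta than to mu0.
        increment = (gap / sigma2) * (x - mu0) - gap * gap / (2.0 * sigma2)
        # Reflect at zero so pre-change samples do not accumulate
        # an arbitrarily negative statistic.
        stat = max(0.0, stat + increment)
        if stat >= threshold:
            return n
    return None
```

In this sketch the threshold trades false alarm rate against delay: raising it makes alarms rarer before the change and slower after it.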
A stochastic multi-user multi-armed bandit framework is used to develop algorithms for uncoordinated spectrum access. In contrast to prior work, it is assumed that rewards can be non-zero even under collisions, thus allowing for the number of users to be greater than the number of channels. The proposed algorithm consists of an estimation phase and an allocation phase. It is shown that if every user adopts the algorithm, the system-wide regret is order-optimal of order $O(\log T)$ over a time-horizon of duration $T$. The regret guarantees hold for both the cases where the number of users is greater than or less than the number of channels. The algorithm is extended to the dynamic case where the number of users in the system evolves over time, and is shown to lead to sub-linear regret.
Multi-user multi-armed bandits have emerged as a good model for uncoordinated spectrum access problems. In this paper we consider the scenario where users cannot communicate with each other. In addition, the environment may appear differently to different users, i.e., the mean rewards as observed by different users for the same channel may be different. With this setup, we present a policy that achieves a regret of $O(\log T)$. This paper has been accepted at Asilomar Conference on Signals, Systems, and Computers 2019.
We consider a fully decentralized multi-player stochastic multi-armed bandit setting where the players cannot communicate with each other and can observe only their own actions and rewards. The environment may appear differently to different players, i.e., the reward distributions for a given arm are heterogeneous across players. In the case of a collision (when more than one player plays the same arm), we allow for the colliding players to receive non-zero rewards. The time-horizon $T$ for which the arms are played is not known to the players. Within this setup, where the number of players is allowed to be greater than the number of arms, we present a policy that achieves near order-optimal expected regret of order $O(\log^{1+\delta} T)$ for some $0 < \delta < 1$ over a time-horizon of duration $T$. This paper is currently under review at IEEE Transactions on Information Theory.
In this paper we study the problem of tracking an object moving randomly through a network of wireless sensors. Our objective is to devise strategies for scheduling the sensors to optimize the tradeoff between tracking performance and energy consumption. We cast the scheduling problem as a Partially Observable Markov Decision Process (POMDP), where the control actions correspond to the set of sensors to activate at each time step. Using a bottom-up approach, we consider different sensing, motion and cost models with increasing levels of difficulty. At the first level, the sensing regions of the different sensors do not overlap and the target is only observed within the sensing range of an active sensor. Then, we consider sensors with overlapping sensing ranges such that the tracking error, and hence the actions of the different sensors, are tightly coupled. Finally, we consider scenarios wherein the target locations and sensor observations assume values on continuous spaces. Exact solutions are generally intractable even for the simplest models due to the dimensionality of the information and action spaces. Hence, we devise approximate solution techniques, and in some cases derive lower bounds on the optimal tradeoff curves. The generated scheduling policies, albeit suboptimal, often provide close-to-optimal energy-tracking tradeoffs.
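The information state in such a POMDP is a belief (a probability distribution over target locations). As a rough illustration of the first-level model with non-overlapping sensing regions, the sketch below performs one belief update. It assumes, for illustration only, a finite set of cells with one sensor per cell, a known Markov transition matrix for the target, at most one active sensor per step, and error-free detection inside the active sensor's cell. The function name `update_belief` and its parameters are hypothetical and do not come from the paper.

```python
def update_belief(belief, transition, active_sensor=None, detected=False):
    """One belief update for a target moving over n cells.

    belief: list of n probabilities (current belief over cells).
    transition: n x n row-stochastic matrix, transition[i][j] is the
                probability of moving from cell i to cell j.
    active_sensor: index of the activated sensor/cell, or None if
                   every sensor sleeps during this step.
    detected: whether the active sensor reported the target
              (assumed error-free in this sketch).
    """
    n = len(belief)
    # Prediction step: propagate the belief through the motion model.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]
    if active_sensor is None:
        # No measurement taken: the prediction is the new belief.
        return predicted
    if detected:
        # Error-free detection localizes the target exactly.
        return [1.0 if j == active_sensor else 0.0 for j in range(n)]
    # No detection: rule out the sensed cell and renormalize.
    predicted[active_sensor] = 0.0
    total = sum(predicted)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / total for p in predicted]
```

Activating a sensor costs energy but sharpens the belief, while leaving all sensors asleep lets the belief spread; a scheduling policy chooses between these at every step.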
The capacity regions are investigated for two relay broadcast channels (RBCs), where relay links are incorporated into standard two-user broadcast channels to support user cooperation. In the first channel, the Partially Cooperative Relay Broadcast Channel, only one user in the system can act as a relay and transmit to the other user through a relay link. An achievable rate region is derived based on the relay using the decode-and-forward scheme. An outer bound on the capacity region is derived and is shown to be tighter than the cut-set bound. For the special case where the Partially Cooperative RBC is degraded, the achievable rate region is shown to be tight and provides the capacity region. Gaussian Partially Cooperative RBCs and Partially Cooperative RBCs with feedback are further studied. In the second channel model being studied in the paper, the Fully Cooperative Relay Broadcast Channel, both users can act as relay nodes and transmit to each other through relay links. This is a more general model than the Partially Cooperative RBC. All the results for Partially Cooperative RBCs are correspondingly generalized to the Fully Cooperative RBCs. It is further shown that the AWGN Fully Cooperative RBC has a larger achievable rate region than the AWGN Partially Cooperative RBC. The results illustrate that relaying and user cooperation are powerful techniques in improving the capacity of broadcast channels.