أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Mihaela van der Schaar

Efficient Interference Management Policies for Femtocell Networks

72 - Kartik Ahuja , Yuanzhang Xiao , Mihaela van der Schaar 2015

Managing interference in a network of macrocells underlaid with femtocells presents an important, yet challenging problem. A majority of spatial (frequency/time) reuse based approaches partition the users based on coloring the interference graph, whi ch is shown to be suboptimal. Some spatial time reuse based approaches schedule the maximal independent sets (MISs) in a cyclic, (weighted) round-robin fashion, which is inefficient for delay-sensitive applications. Our proposed policies schedule the MISs in a non-cyclic fashion, which aim to optimize any given network performance criterion for delay-sensitive applications while fulfilling minimum throughput requirements of the users. Importantly, we do not take the interference graph as given as in existing works; we propose an optimal construction of the interference graph. We prove that under certain conditions, the proposed policy achieves the optimal network performance. For large networks, we propose a low-complexity algorithm for computing the proposed policy. We show that the policy computed achieves a constant competitive ratio (with respect to the optimal network performance), which is independent of the network size, under wide range of deployment scenarios. The policy can be implemented in a decentralized manner by the users. Compared to the existing policies, our proposed policies can achieve improvement of up to 130 % in large-scale deployments.

نظرية المعلومات نظرية المعلومات

Global Bandits

40 - Onur Atan , Cem Tekin , Mihaela van der Schaar 2015

Multi-armed bandits (MAB) model sequential decision making problems, in which a learner sequentially chooses arms with unknown reward distributions in order to maximize its cumulative reward. Most of the prior work on MAB assumes that the reward dist ributions of each arm are independent. But in a wide variety of decision problems -- from drug dosage to dynamic pricing -- the expected rewards of different arms are correlated, so that selecting one arm provides information about the expected rewards of other arms as well. We propose and analyze a class of models of such decision problems, which we call {em global bandits}. In the case in which rewards of all arms are deterministic functions of a single unknown parameter, we construct a greedy policy that achieves {em bounded regret}, with a bound that depends on the single true parameter of the problem. Hence, this policy selects suboptimal arms only finitely many times with probability one. For this case we also obtain a bound on regret that is {em independent of the true parameter}; this bound is sub-linear, with an exponent that depends on the informativeness of the arms. We also propose a variant of the greedy policy that achieves $tilde{mathcal{O}}(sqrt{T})$ worst-case and $mathcal{O}(1)$ parameter dependent regret. Finally, we perform experiments on dynamic pricing and show that the proposed algorithms achieve significant gains with respect to the well-known benchmarks.

التعلم الآلي

Distributed Interference Management Policies for Heterogeneous Small Cell Networks

346 - Kartik Ahuja , Yuanzhang Xiao , Mihaela van der Schaar 2014

We study the problem of interference management in large-scale small cell networks, where each user equipment (UE) needs to determine in a distributed manner when and at what power level it should transmit to its serving small cell base station (SBS) such that a given network performance criterion is maximized subject to minimum quality of service (QoS) requirements by the UEs. We first propose a distributed algorithm for the UE-SBS pairs to find a subset of weakly interfering UE-SBS pairs, namely the maximal independent sets (MISs) of the interference graph in logarithmic time (with respect to the number of UEs). Then we propose a novel problem formulation which enables UE-SBS pairs to determine the optimal fractions of time occupied by each MIS in a distributed manner. We analytically bound the performance of our distributed policy in terms of the competitive ratio with respect to the optimal network performance, which is obtained in a centralized manner with NP (non-deterministic polynomial time) complexity. Remarkably, the competitive ratio is independent of the network size, which guarantees scalability in terms of performance for arbitrarily large networks. Through simulations, we show that our proposed policies achieve significant performance improvements (from 150% to 700%) over the existing policies.

نظرية المعلومات نظرية المعلومات

Incentive Design in Peer Review: Rating and Repeated Endogenous Matching

43 - Yuanzhang Xiao , Florian Dorfler , Mihaela van der Schaar 2014

Peer review (e.g., grading assignments in Massive Open Online Courses (MOOCs), academic paper review) is an effective and scalable method to evaluate the products (e.g., assignments, papers) of a large number of agents when the number of dedicated re viewing experts (e.g., teaching assistants, editors) is limited. Peer review poses two key challenges: 1) identifying the reviewers intrinsic capabilities (i.e., adverse selection) and 2) incentivizing the reviewers to exert high effort (i.e., moral hazard). Some works in mechanism design address pure adverse selection using one-shot matching rules, and pure moral hazard was addressed in repeated games with exogenously given and fixed matching rules. However, in peer review systems exhibiting both adverse selection and moral hazard, one-shot or exogenous matching rules do not link agents current behavior with future matches and future payoffs, and as we prove, will induce myopic behavior (i.e., exerting the lowest effort) resulting in the lowest review quality. In this paper, we propose for the first time a solution that simultaneously solves adverse selection and moral hazard. Our solution exploits the repeated interactions of agents, utilizes ratings to summarize agents past review quality, and designs matching rules that endogenously depend on agents ratings. Our proposed matching rules are easy to implement and require no knowledge about agents private information (e.g., their benefit and cost functions). Yet, they are effective in guiding the system to an equilibrium where the agents are incentivized to exert high effort and receive ratings that precisely reflect their review quality. Using several illustrative examples, we quantify the significant performance gains obtained by our proposed mechanism as compared to existing one-shot or exogenous matching rules.

علوم الكمبيوتر ونظرية الألعاب

Foresighted Demand Side Management

93 - Yuanzhang Xiao , Mihaela van der Schaar 2014

We consider a smart grid with an independent system operator (ISO), and distributed aggregators who have energy storage and purchase energy from the ISO to serve its customers. All the entities in the system are foresighted: each aggregator seeks to minimize its own long-term payments for energy purchase and operational costs of energy storage by deciding how much energy to buy from the ISO, and the ISO seeks to minimize the long-term total cost of the system (e.g. energy generation costs and the aggregators costs) by dispatching the energy production among the generators. The decision making of the entities is complicated for two reasons. First, the information is decentralized: the ISO does not know the aggregators states (i.e. their energy consumption requests from customers and the amount of energy in their storage), and each aggregator does not know the other aggregators states or the ISOs state (i.e. the energy generation costs and the status of the transmission lines). Second, the coupling among the aggregators is unknown to them. Specifically, each aggregators energy purchase affects the price, and hence the payments of the other aggregators. However, none of them knows how its decision influences the price because the price is determined by the ISO based on its state. We propose a design framework in which the ISO provides each aggregator with a conjectured future price, and each aggregator distributively minimizes its own long-term cost based on its conjectured price as well as its local information. The proposed framework can achieve the social optimum despite being decentralized and involving complex coupling among the various entities.

أنظمة متعددة العملاء

Optimal Foresighted Multi-User Wireless Video

106 - Yuanzhang Xiao , Mihaela van der Schaar 2013

Recent years have seen an explosion in wireless video communication systems. Optimization in such systems is crucial - but most existing methods intended to optimize the performance of multi-user wireless video transmission are inefficient. Some work s (e.g. Network Utility Maximization (NUM)) are myopic: they choose actions to maximize instantaneous video quality while ignoring the future impact of these actions. Such myopic solutions are known to be inferior to foresighted solutions that optimize the long-term video quality. Alternatively, foresighted solutions such as rate-distortion optimized packet scheduling focus on single-user wireless video transmission, while ignoring the resource allocation among the users. In this paper, we propose an optimal solution for performing joint foresighted resource allocation and packet scheduling among multiple users transmitting video over a shared wireless network. A key challenge in developing foresighted solutions for multiple video users is that the users decisions are coupled. To decouple the users decisions, we adopt a novel dual decomposition approach, which differs from the conventional optimization solutions such as NUM, and determines foresighted policies. Specifically, we propose an informationally-decentralized algorithm in which the network manager updates resource prices (i.e. the dual variables associated with the resource constraints), and the users make individual video packet scheduling decisions based on these prices. Because a priori knowledge of the system dynamics is almost never available at run-time, the proposed solution can learn online, concurrently with performing the foresighted optimization. Simulation results show 7 dB and 3 dB improvements in Peak Signal-to-Noise Ratio (PSNR) over myopic solutions and existing foresighted solutions, respectively.

الوسائط المتعددة

Structural Solutions to Dynamic Scheduling for Multimedia Transmission in Unknown Wireless Environments

121 - Fangwen Fu , , Mihaela van der Schaar 2010

In this paper, we propose a systematic solution to the problem of scheduling delay-sensitive media data for transmission over time-varying wireless channels. We first formulate the dynamic scheduling problem as a Markov decision process (MDP) that ex plicitly considers the users heterogeneous multimedia data characteristics (e.g. delay deadlines, distortion impacts and dependencies etc.) and time-varying channel conditions, which are not simultaneously considered in state-of-the-art packet scheduling algorithms. This formulation allows us to perform foresighted decisions to schedule multiple data units for transmission at each time in order to optimize the long-term utilities of the multimedia applications. The heterogeneity of the media data enables us to express the transmission priorities between the different data units as a priority graph, which is a directed acyclic graph (DAG). This priority graph provides us with an elegant structure to decompose the multi-data unit foresighted decision at each time into multiple single-data unit foresighted decisions which can be performed sequentially, from the high priority data units to the low priority data units, thereby significantly reducing the computation complexity. When the statistical knowledge of the multimedia data characteristics and channel conditions is unknown a priori, we develop a low-complexity online learning algorithm to update the value functions which capture the impact of the current decision on the future utility. The simulation results show that the proposed solution significantly outperforms existing state-of-the-art scheduling solutions.

الوسائط المتعددة التعلم الآلي أنظمة وتحكم

Structure-Aware Stochastic Control for Transmission Scheduling

252 - Fangwen Fu , Mihaela van der Schaar 2010

In this paper, we consider the problem of real-time transmission scheduling over time-varying channels. We first formulate the transmission scheduling problem as a Markov decision process (MDP) and systematically unravel the structural properties (e. g. concavity in the state-value function and monotonicity in the optimal scheduling policy) exhibited by the optimal solutions. We then propose an online learning algorithm which preserves these structural properties and achieves -optimal solutions for an arbitrarily small . The advantages of the proposed online method are that: (i) it does not require a priori knowledge of the traffic arrival and channel statistics and (ii) it adaptively approximates the state-value functions using piece-wise linear functions and has low storage and computation complexity. We also extend the proposed low-complexity online learning solution to the prioritized data transmission. The simulation results demonstrate that the proposed method achieves significantly better utility (or delay)-energy trade-offs when comparing to existing state-of-art online optimization methods.

التعلم الآلي نظرية المعلومات الوسائط المتعددة

Stackelberg Contention Games in Multiuser Networks

91 - Jaeok Park , Mihaela van der Schaar 2009

Interactions among selfish users sharing a common transmission channel can be modeled as a non-cooperative game using the game theory framework. When selfish users choose their transmission probabilities independently without any coordination mechani sm, Nash equilibria usually result in a network collapse. We propose a methodology that transforms the non-cooperative game into a Stackelberg game. Stackelberg equilibria of the Stackelberg game can overcome the deficiency of the Nash equilibria of the original game. A particular type of Stackelberg intervention is constructed to show that any positive payoff profile feasible with independent transmission probabilities can be achieved as a Stackelberg equilibrium payoff profile. We discuss criteria to select an operating point of the network and informational requirements for the Stackelberg game. We relax the requirements and examine the effects of relaxation on performance.

علوم الكمبيوتر ونظرية الألعاب بنية الشبكات والإنترنت

Structural Solutions for Cross-Layer Optimization of Wireless Multimedia Transmission

82 - Fangwen Fu , Mihaela van der Schaar 2009

In this paper, we propose a systematic solution to the problem of cross-layer optimization for delay-sensitive media transmission over time-varying wireless channels as well as investigate the structures and properties of this solution, such that it can be easily implemented in various multimedia systems and applications. Specifically, we formulate this problem as a finite-horizon Markov decision process (MDP) by explicitly considering the users heterogeneous multimedia traffic characteristics (e.g. delay deadlines, distortion impacts and dependencies etc.), time-varying network conditions as well as, importantly, their ability to adapt their cross-layer transmission strategies in response to these dynamics. Based on the heterogeneous characteristics of the media packets, we are able to express the transmission priorities between packets as a new type of directed acyclic graph (DAG). This DAG provides the necessary structure for determining the optimal cross-layer actions in each time slot: the root packet in the DAG will always be selected for transmission since it has the highest positive marginal utility; and the complexity of the proposed cross-layer solution is demonstrated to linearly increase w.r.t. the number of disconnected packet pairs in the DAG and exponentially increase w.r.t. the number of packets on which the current packets depend on. The simulation results demonstrate that the proposed solution significantly outperforms existing state-of-the-art cross-layer solutions. Moreover, we show that our solution provides the upper bound performance for the cross-layer optimization solutions with delayed feedback such as the well-known RaDiO framework.

الوسائط المتعددة نظرية المعلومات نظرية المعلومات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد