ترغب بنشر مسار تعليمي؟ اضغط هنا

Identity Concealment Games: How I Learned to Stop Revealing and Love the Coincidences

63   0   0.0 ( 0 )
 نشر من قبل Mustafa O. Karabag
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In an adversarial environment, a hostile player performing a task may behave like a non-hostile one in order not to reveal its identity to an opponent. To model such a scenario, we define identity concealment games: zero-sum stochastic reachability games with a zero-sum objective of identity concealment. To measure the identity concealment of the player, we introduce the notion of an average player. The average players policy represents the expected behavior of a non-hostile player. We show that there exists an equilibrium policy pair for every identity concealment game and give the optimality equations to synthesize an equilibrium policy pair. If the players opponent follows a non-equilibrium policy, the player can hide its identity better. For this reason, we study how the hostile player may learn the opponents policy. Since learning via exploration policies would quickly reveal the hostile players identity to the opponent, we consider the problem of learning a near-optimal policy for the hostile player using the game runs collected under the average players policy. Consequently, we propose an algorithm that provably learns a near-optimal policy and give an upper bound on the number of sample runs to be collected.

قيم البحث

اقرأ أيضاً

Cost-based query optimizers remain one of the most important components of database management systems for analytic workloads. Though modern optimizers select plans close to optimal performance in the common case, a small number of queries are an ord er of magnitude slower than they could be. In this paper we investigate why this is still the case, despite decades of improvements to cost models, plan enumeration, and cardinality estimation. We demonstrate why we believe that a re-optimization mechanism is likely the most cost-effective way to improve end-to-end query performance. We find that even a simple re-optimization scheme can improve the latency of many poorly performing queries. We demonstrate that re-optimization improves the end-to-end latency of the top 20 longest running queries in the Join Order Benchmark by 27%, realizing most of the benefit of perfect cardinality estimation.
This paper is a `spiritual child of the 2005 lecture notes Kindergarten Quantum Mechanics, which showed how a simple, pictorial extension of Dirac notation allowed several quantum features to be easily expressed and derived, using language even a kin dergartner can understand. Central to that approach was the use of pictures and pictorial transformation rules to understand and derive features of quantum theory and computation. However, this approach left many wondering `wheres the beef? In other words, was this new approach capable of producing new results, or was it simply an aesthetically pleasing way to restate stuff we already know? The aim of this sequel paper is to say `heres the beef!, and highlight some of the major results of the approach advocated in Kindergarten Quantum Mechanics, and how they are being applied to tackle practical problems on real quantum computers. We will focus mainly on what has become the Swiss army knife of the pictorial formalism: the ZX-calculus. First we look at some of the ideas behind the ZX-calculus, comparing and contrasting it with the usual quantum circuit formalism. We then survey results from the past 2 years falling into three categories: (1) completeness of the rules of the ZX-calculus, (2) state-of-the-art quantum circuit optimisation results in commercial and open-source quantum compilers relying on ZX, and (3) the use of ZX in translating real-world stuff like natural language into quantum circuits that can be run on todays (very limited) quantum hardware. We also take the title literally, and outline an ongoing experiment aiming to show that ZX-calculus enables children to do cutting-edge quantum computing stuff. If anything, this would truly confirm that `kindergarten quantum mechanics wasnt just a joke.
Adversarial training, a special case of multi-objective optimization, is an increasingly prevalent machine learning technique: some of its most notable applications include GAN-based generative modeling and self-play techniques in reinforcement learn ing which have been applied to complex games such as Go or Poker. In practice, a emph{single} pair of networks is typically trained in order to find an approximate equilibrium of a highly nonconcave-nonconvex adversarial problem. However, while a classic result in game theory states such an equilibrium exists in concave-convex games, there is no analogous guarantee if the payoff is nonconcave-nonconvex. Our main contribution is to provide an approximate minimax theorem for a large class of games where the players pick neural networks including WGAN, StarCraft II, and Blotto Game. Our findings rely on the fact that despite being nonconcave-nonconvex with respect to the neural networks parameters, these games are concave-convex with respect to the actual models (e.g., functions or distributions) represented by these neural networks.
Star-formation rates (SFR) of disk galaxies strongly correlate with stellar mass, with a small dispersion in SSFR at fixed mass, sigma~0.3 dex. With such small scatter this star-formation main sequence (SFMS) has been interpreted as deterministic and fundamental. Here we demonstrate that it is a simple consequence of the central limit theorem. Our derivation begins by approximating in situ stellar mass growth as a stochastic process, much like a random walk (where the expectation of SFR at any time is equal to the SFR at the previous time). We then derive expectation values for median SSFR of star-forming disks and their scatter over time. We generalize the results for stochastic changes in SFR that are not independent of each other but are correlated over time. For unbiased samples of (disk) galaxies, we derive an expectation that <SSFR> should be independent of mass, decline as 1/T, and have a relative scatter that is independent of mass and time. The derived SFMS and its evolution matches published data to z=10 with sufficient accuracy to constrain cosmological parameters. The framework reproduces several important observables, including: the scatter in SSFR at fixed mass; the SFHs of nearby dwarf galaxies and the Milky Way; and the scatter in the Tully-Fisher relation. The evolution of the mass function is less well reproduced and we discuss ways to generalize the framework to include other sources of stellar mass such as mergers. The predicted dispersion in SSFR has consequences for the classification of quiescent galaxies, as such galaxies have heterogeneous formation histories, and many may only be temporarily diminished in their star-formation activity. The implied dispersion in SFHs, and the SFMSs insensitivity to timescales of stochasticity, thus substantially limits the ability to connect massive galaxies to their progenitors over long cosmic baselines. [TRUNC.]
We motivate and propose a new model for non-cooperative Markov game which considers the interactions of risk-aware players. This model characterizes the time-consistent dynamic risk from both stochastic state transitions (inherent to the game) and ra ndomized mixed strategies (due to all other players). An appropriate risk-aware equilibrium concept is proposed and the existence of such equilibria is demonstrated in stationary strategies by an application of Kakutanis fixed point theorem. We further propose a simulation-based Q-learning type algorithm for risk-aware equilibrium computation. This algorithm works with a special form of minimax risk measures which can naturally be written as saddle-point stochastic optimization problems, and covers many widely investigated risk measures. Finally, the almost sure convergence of this simulation-based algorithm to an equilibrium is demonstrated under some mild conditions. Our numerical experiments on a two player queuing game validate the properties of our model and algorithm, and demonstrate their worth and applicability in real life competitive decision-making.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا