A PDE Approach to the Prediction of a Binary Sequence with Advice from Two History-Dependent Experts

153 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Nadejda Drenska

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Nadejda Drenska - Robert V. Kohn

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The prediction of a binary sequence is a classic example of online machine learning. We like to call it the stock prediction problem, viewing the sequence as the price history of a stock that goes up or down one unit at each time step. In this problem, an investor has access to the predictions of two or more experts, and strives to minimize her final-time regret with respect to the best-performing expert. Probability plays no role; rather, the market is assumed to be adversarial. We consider the case when there are two history-dependent experts, whose predictions are determined by the d most recent stock moves. Focusing on an appropriate continuum limit and using methods from optimal control, graph theory, and partial differential equations, we discuss strategies for the investor and the adversarial market, and we determine associated upper and lower bounds for the investors final-time regret. When d is less than 4 our upper and lower bounds coalesce, so the proposed strategies are asymptotically optimal. Compared to other recent applications of partial differential equations to prediction, ours has a new element: there are two timescales, since the recent history changes at every step whereas regret accumulates more slowly.

قيم البحث

58 - Nadejda Drenska , Robert V. Kohn 2019

This work addresses a classic problem of online prediction with expert advice. We assume an adversarial opponent, and we consider both the finite-horizon and random-stoppi

تحليل PDES علوم الكمبيوتر ونظرية الألعاب التعلم الآلي

New Potential-Based Bounds for Prediction with Expert Advice

77 - Vladimir A. Kobzar , Robert V. Kohn , Zhilei Wang 2019

This work addresses the classic machine learning problem of online prediction with expert advice. We consider the finite-horizon version of this zero-sum, two-person game. Using verification arguments from optimal control theory, we view the task of finding better lower and upper bounds on the value of the game (regret) as the problem of finding better sub- and supersolutions of certain partial differential equations (PDEs). These sub- and supersolutions serve as the potentials for player and adversary strategies, which lead to the corresponding bounds. To get explicit bounds, we use closed-form solutions of specific PDEs. Our bounds hold for any given number of experts and horizon; in certain regimes (which we identify) they improve upon the previous state of the art. For two and three experts, our bounds provide the optimal leading order term.

التعلم الآلي علوم الكمبيوتر ونظرية الألعاب تحليل PDES

New Potential-Based Bounds for the Geometric-Stopping Version of Prediction with Expert Advice

63 - Vladimir A. Kobzar , Robert V. Kohn , Zhilei Wang 2019

This work addresses the classic machine learning problem of online prediction with expert advice. A new potential-based framework for the fixed horizon version of this problem has been recently developed using verification arguments from optimal cont rol theory. This paper extends this framework to the random (geometric) stopping version. To obtain explicit bounds, we construct potentials for the geometric version from potentials used for the fixed horizon version of the problem. This construction leads to new explicit lower and upper bounds associated with specific adversary and player strategies. While there are several known lower bounds in the fixed horizon setting, our lower bounds appear to be the first such results in the geometric stopping setting with an arbitrary number of experts. Our framework also leads in some cases to improved upper bounds. For two and three experts, our bounds are optimal to leading order.

التعلم الآلي علوم الكمبيوتر ونظرية الألعاب تحليل PDES

Mean Field Games with Mean-Field-Dependent Volatility: A PDE Approach

196 - Ziyu Huang , Shanjian Tang 2021

In this paper, we develop a PDE approach to consider the optimal strategy of mean field controlled stochastic system. Firstly, we discuss mean field SDEs and associated Fokker-Plank eqautions. Secondly, we consider a fully-coupled system of forward-b ackward PDEs. The backward one is the Hamilton-Jacobi-Bellman equation while the forward one is the Fokker-Planck equation. Our main result is to show the existence of classical solutions of the forward-backward PDEs in the class $H^{1+frac{1}{4},2+frac{1}{2}}([0,T]timesmathbb{R}^n)$ by use of the Schauder fixed point theorem. Then, we use the solution to give the optimal strategy of the mean field stochastic control problem. Finally, we give an example to illustrate the role of our main result.

الاحتمالات التحسين والتحكم

A Primal Condition for Approachability with Partial Monitoring

170 - Shie Mannor , Gilles Stoltzn (INRIA Paris - Rocquencourt 2013

In approachability with full monitoring there are two types of conditions that are known to be equivalent for convex sets: a primal and a dual condition. The primal one is of the form: a set C is approachable if and only all containing half-spaces ar e approachable in the one-shot game; while the dual one is of the form: a convex set C is approachable if and only if it intersects all payoff sets of a certain form. We consider approachability in games with partial monitoring. In previous works (Perchet 2011; Mannor et al. 2011) we provided a dual characterization of approachable convex sets; we also exhibited efficient strategies in the case where C is a polytope. In this paper we provide primal conditions on a convex set to be approachable with partial monitoring. They depend on a modified reward function and lead to approachability strategies, based on modified payoff functions, that proceed by projections similarly to Blackwells (1956) strategy; this is in contrast with previously studied strategies in this context that relied mostly on the signaling structure and aimed at estimating well the distributions of the signals received. Our results generalize classical results by Kohlberg 1975 (see also Mertens et al. 1994) and apply to games with arbitrary signaling structure as well as to arbitrary convex sets.

التحسين والتحكم علوم الكمبيوتر ونظرية الألعاب التعلم الآلي