Asymptotic Randomised Control with applications to bandits

Published by: Tanut Treetanthiploet

Publication date: 2020

Research field: Mathematical Statistics

Language: English

Authors: Samuel N. Cohen, Tanut Treetanthiploet

Optimization and Control; Machine Learning

Abstract

We consider a general multi-armed bandit problem with correlated (and simple contextual and restless) elements, as a relaxed control problem. By introducing an entropy premium, we obtain a smooth asymptotic approximation to the value function. This yields a novel semi-index approximation of the optimal decision process, obtained numerically by solving a fixed point problem, which can be interpreted as explicitly balancing an exploration-exploitation trade-off. Performance of the resulting Asymptotic Randomised Control (ARC) algorithm compares favourably with other approaches to correlated multi-armed bandits.
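The abstract's remark that an entropy premium yields a smooth approximation can be illustrated with a standard textbook identity. This is not the ARC algorithm from the paper: maximising expected reward plus `lam` times the Shannon entropy over randomised choices gives a log-sum-exp "smoothed maximum", attained by a softmax distribution. The sketch below is a minimal check of that identity; the names `values`, `lam`, `smoothed_max`, and `softmax_policy` are assumptions made for illustration and do not come from the paper.

```python
import math


def smoothed_max(values, lam):
    """Entropy-regularised maximum: lam * log(sum(exp(v / lam)))."""
    m = max(values)  # shift by the maximum for numerical stability
    return m + lam * math.log(sum(math.exp((v - m) / lam) for v in values))


def softmax_policy(values, lam):
    """Randomised choice that attains the entropy-regularised maximum."""
    m = max(values)
    weights = [math.exp((v - m) / lam) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy (natural logarithm) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical estimated rewards for three arms, and an entropy weight.
values = [1.0, 0.5, 0.2]
lam = 0.1

probs = softmax_policy(values, lam)
expected_reward = sum(p * v for p, v in zip(probs, values))
regularised_value = expected_reward + lam * entropy(probs)
```

Here `regularised_value` agrees with `smoothed_max(values, lam)` up to floating-point error. As `lam` shrinks, the softmax concentrates on the best arm and the smoothed maximum approaches the ordinary maximum, which is one way to read the exploration-exploitation balance described in the abstract.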

Read also

369 - Martin Andreasson, Dimos V. Dimarogonas, Henrik Sandberg 2013

This paper considers a distributed PI-controller for networked dynamical systems. Sufficient conditions for when the controller is able to stabilize a general linear system and eliminate static control errors are presented. The proposed controller is applied to frequency control of power transmission systems. Sufficient stability criteria are derived, and it is shown that the controller parameters can always be chosen so that the frequencies in the closed loop converge to nominal operational frequency. We show that the load sharing property of the generators is maintained, i.e., the input power of the generators is proportional to a controller parameter. The controller is evaluated by simulation on the IEEE 30 bus test network, where its effectiveness is demonstrated.

Optimization and Control

Performance-Barrier-Based Event-Triggered Control with Applications to Network Systems

143 - Pio Ong, Jorge Cortes 2021

This paper proposes a novel framework for resource-aware control design termed performance-barrier-based triggering. Given a feedback policy, along with a Lyapunov function certificate that guarantees its correctness, we examine the problem of designing its digital implementation through event-triggered control while ensuring a prescribed performance is met and triggers occur as sparingly as possible. Our methodology takes into account the performance residual, i.e., how well the system is doing with regard to the prescribed performance. Inspired by the notion of control barrier function, the trigger design allows the certificate to deviate from monotonically decreasing, with leeway specified as an increasing function of the performance residual, resulting in greater flexibility in prescribing update times. We study different types of performance specifications, with particular attention to quantifying the benefits of the proposed approach in the exponential case. We build on this to design intrinsically Zeno-free distributed triggers for network systems. A comparison of event-triggered approaches in a vehicle platooning problem shows how the proposed design meets the prescribed performance with a significantly lower number of controller updates.

Optimization and Control; Systems and Control

From local to global asymptotic stabilizability for weakly contractive control systems

55 - Vincent Andrieu 2020

A nonlinear control system is said to be weakly contractive in the control if the flow that it generates is non-expanding (in the sense that the distance between two trajectories is a non-increasing function of time) for some fixed Riemannian metric independent of the control. We prove in this paper that for such systems, local asymptotic stabilizability implies global asymptotic stabilizability by means of a dynamic state feedback. We link this result and the so-called Jurdjevic and Quinn approach.

Optimization and Control

Asymptotic behaviour of randomised fractional volatility models

62 - B. Horvath, A. Jacquier, C. Lacombe 2017

We study the asymptotic behaviour of a class of small-noise diffusions driven by fractional Brownian motion, with random starting points. Different scalings allow for different asymptotic properties of the process (small-time and tail behaviours in particular). In order to do so, we extend some results on sample path large deviations for such diffusions. As an application, we show how these results characterise the small-time and tail estimates of the implied volatility for rough volatility models, recently proposed in mathematical finance.

Probability

Non-asymptotic estimates for TUSLA algorithm for non-convex learning with applications to neural networks with ReLU activation function

61 - Dong-Young Lim, Ariel Neufeld, Sotirios Sabanis 2021

We consider non-convex stochastic optimization problems where the objective functions have super-linearly growing and discontinuous stochastic gradients. In such a setting, we provide a non-asymptotic analysis for the tamed unadjusted stochastic Langevin algorithm (TUSLA) introduced in Lovas et al. (2021). In particular, we establish non-asymptotic error bounds for the TUSLA algorithm in Wasserstein-1 and Wasserstein-2 distances. The latter result enables us to further derive non-asymptotic estimates for the expected excess risk. To illustrate the applicability of the main results, we consider an example from transfer learning with ReLU neural networks, which represents a key paradigm in machine learning. Numerical experiments are presented for the aforementioned example which support our theoretical findings. Hence, in this setting, we demonstrate both theoretically and numerically that the TUSLA algorithm can solve the optimization problem involving neural networks with ReLU activation function. Besides, we provide simulation results for synthetic examples where popular algorithms, e.g. ADAM, AMSGrad, RMSProp, and (vanilla) SGD, may fail to find the minimizer of the objective functions due to the super-linear growth and the discontinuity of the corresponding stochastic gradient, while the TUSLA algorithm converges rapidly to the optimal solution.

Optimization and Control; Machine Learning; Numerical Analysis
