Dynamic Programming and Linear Programming for Odds Problem

Published by Tomomi Matsui

Publication date: 2021

Research field: Mathematical Statistics

Language: English

Authors: Sachika Kurokawa, Tomomi Matsui

Optimization and Control; Statistics Applications

This paper discusses the odds problem, proposed by Bruss in 2000, and its variants. A recurrence relation called a dynamic programming (DP) equation is used to find an optimal stopping policy for the odds problem and its variants. In 2013, Buchbinder, Jain, and Singh proposed a linear programming (LP) formulation for finding an optimal stopping policy for the classical secretary problem, which is a special case of the odds problem. The proposed linear programming problem, which maximizes the probability of a win, differs from the long-known DP equations. This paper shows that an ordinary DP equation is a modification of the dual of a linear programming problem that includes the LP formulation proposed by Buchbinder, Jain, and Singh.
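
To make the DP recurrence mentioned in the abstract concrete, here is a minimal sketch. It assumes the standard setup of the odds problem: independent 0/1 indicators observed in order, with success probabilities p_1, ..., p_n, where the player wins by stopping exactly on the last success. The function name `odds_dp` and the exact form of the recurrence are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple


def odds_dp(p: List[float]) -> Tuple[float, List[bool]]:
    """Backward DP for the odds problem (illustrative sketch).

    p[i] is the assumed success probability of the (i+1)-th indicator.
    Returns the optimal win probability and, for each index, whether
    stopping on a success observed at that index is optimal.
    """
    n = len(p)
    value = 0.0    # optimal win probability from the next index onward
    no_more = 1.0  # probability that no later indicator is a success
    stop = [False] * n
    for i in range(n - 1, -1, -1):
        # Stopping on a success at i wins iff no later success occurs.
        stop[i] = no_more >= value
        # On a success take the better of stopping and continuing;
        # on a failure we can only continue.
        value = p[i] * max(no_more, value) + (1.0 - p[i]) * value
        no_more *= 1.0 - p[i]
    return value, stop


# Classical secretary problem with n = 4: the i-th candidate is the
# best so far with probability 1/i.
win_prob, policy = odds_dp([1.0, 1 / 2, 1 / 3, 1 / 4])
print(win_prob, policy)
```

For the secretary instance above, the policy skips the first candidate and then stops at the first candidate who is the best so far, which agrees with the classical solution for four candidates.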

352 - V. V. Desai, V. F. Farias, C. C. Moallemi 2009

We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural 'projection' of a well-studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go function. Our program, the 'smoothed approximate linear program', is distinct from such approaches and relaxes the restriction to lower bounding approximations in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages: First, we demonstrate substantially superior bounds on the quality of approximation to the optimal cost-to-go function afforded by our approach. Second, experiments with our approach on a challenging problem (the game of Tetris) show that the approach outperforms the existing LP approach (which has previously been shown to be competitive with several ADP algorithms) by an order of magnitude.

Optimization and Control

Dynamic Programming for General Linear Quadratic Optimal Stochastic Control with Random Coefficients

376 - Shanjian Tang 2014

We are concerned with the linear-quadratic optimal stochastic control problem with random coefficients. Under suitable conditions, we prove that the value field $V(t,x,\omega)$, $(t,x,\omega)\in [0,T]\times \mathbb{R}^n\times \Omega$, is quadratic in $x$, and has the following form: $V(t,x)=\langle K_t x, x\rangle$, where $K$ is an essentially bounded nonnegative symmetric matrix-valued adapted process. Using the dynamic programming principle (DPP), we prove that $K$ is a continuous semi-martingale of the form $$K_t=K_0+\int_0^t \, dk_s+\sum_{i=1}^d\int_0^t L_s^i\, dW_s^i, \quad t\in [0,T]$$ with $k$ being a continuous process of bounded variation and $$E\left[\left(\int_0^T|L_s|^2\, ds\right)^p\right] <\infty, \quad \forall p\ge 2;$$ and that $(K, L)$ with $L:=(L^1, \cdots, L^d)$ is a solution to the associated backward stochastic Riccati equation (BSRE), whose generator is highly nonlinear in the unknown pair of processes. The uniqueness is also proved via a localized completion of squares in a self-contained manner for a general BSRE. The existence and uniqueness of an adapted solution to a general BSRE was initially proposed by the French mathematician J. M. Bismut (1976, 1978). It was solved by the author (2003) via the stochastic maximum principle with a viewpoint of stochastic flow for the associated stochastic Hamiltonian system. The present paper is its companion, and gives the second but more comprehensive adapted solution to a general BSRE via the DPP. Further extensions to the jump-diffusion control system and to the general nonlinear control system are possible.

Optimization and Control

Guaranteed Bounds for General Approximate Dynamic Programming

571 - Yajing Liu, Edwin K. P. Chong, Ali Pezeshki 2014

In this paper, we develop a systematic approach to deriving guaranteed bounds for approximate dynamic programming (ADP) schemes in optimal control problems. Our approach is inspired by our recent results on bounding the performance of greedy strategies in optimization of string-submodular functions over a finite horizon. The approach is to derive a string-submodular optimization problem, for which the optimal strategy is the optimal control solution and the greedy strategy is the ADP solution. Using this approach, we show that any ADP solution achieves a performance that is at least a factor of $\beta$ of the performance of the optimal control solution, which satisfies Bellman's optimality principle. The factor $\beta$ depends on the specific ADP scheme, as we will explicitly characterize. To illustrate the applicability of our bounding technique, we present examples of ADP schemes, including the popular rollout method.

Optimization and Control

Solving Splitted Multi-Commodity Flow Problem by Efficient Linear Programming Algorithm

134 - Liyun Dai, Hengjun Zhao, Zhiming Liu 2019

Column generation is often used to solve multi-commodity flow problems. A program for column generation always includes a module that solves a linear equation. In this paper, we address three major issues in solving the linear problem during the column generation procedure: (1) how to exploit the sparsity of the coefficient matrix; (2) how to reduce the size of the coefficient matrix; and (3) how to reuse the solution to a similar equation. To this end, we first analyze the sparsity of the coefficient matrices of the linear equations and find that the matrices occurring in the iterations are very sparse. Then, we present an algorithm locSolver (for localized system solver) for linear equations with sparse coefficient matrices and right-hand sides. This algorithm can reduce the number of variables. After that, we present the algorithm incSolver (for incremental system solver), which utilizes similarity across the iterations of the program for a linear equation system. All three techniques can be used in column generation for multi-commodity problems. Preliminary numerical experiments show that incSolver is significantly faster than the existing algorithms. For example, random test cases show that incSolver is at least 37 times and up to 341 times faster than the popular solver LAPACK.

Optimization and Control

Dynamic programming using radial basis functions

338 - Oliver Junge, Alex Schreiber 2014

We propose a discretization of the optimality principle in dynamic programming based on radial basis functions and Shepard's moving least squares approximation method. We prove convergence of the approximate optimal value function to the true one and present several numerical experiments.

Optimization and Control
