
Online Linear Programming: Dual Convergence, New Algorithms, and Regret Bounds

Published by: Xiaocheng Li
Publication date: 2019
Research field: Information Engineering
Paper language: English





We study an online linear programming (OLP) problem under a random input model in which the columns of the constraint matrix, along with the corresponding coefficients in the objective function, are generated i.i.d. from an unknown distribution and revealed sequentially over time. Virtually all pre-existing online algorithms were based on learning the dual optimal solutions/prices of the linear programs (LP), and their analyses focused on the aggregate objective value and on solving the packing LP, where all coefficients in the constraint matrix and objective are nonnegative. However, two major open questions remained: (i) Does the set of LP optimal dual prices learned in the pre-existing algorithms converge to those of the offline LP? (ii) Can the results be extended to general LP problems where the coefficients can be either positive or negative? We resolve these two questions by establishing convergence results for the dual prices under moderate regularity conditions for general LP problems. Specifically, we identify an equivalent form of the dual problem which relates the dual LP to a sample average approximation of a stochastic program. Furthermore, we propose a new type of OLP algorithm, the Action-History-Dependent Learning Algorithm, which improves on previous algorithms by taking into account both the past input data and the decisions/actions already made. We derive an $O(\log n \log \log n)$ regret bound (under a locally strong convexity and smoothness condition) for the proposed algorithm, against the $O(\sqrt{n})$ bound for typical dual-price learning algorithms, where $n$ is the number of decision variables. Numerical experiments demonstrate the effectiveness of the proposed algorithm and the action-history-dependent design.
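
To make the dual-price viewpoint concrete, below is a minimal Python sketch of an action-history-dependent decision rule: before each decision it re-solves the sample-average dual (in its LP epigraph form) over the history observed so far, spreading the remaining capacity over the remaining periods. The function names, the use of `scipy.optimize.linprog`, and the packing-style feasibility guard are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
from scipy.optimize import linprog

def dual_price(R, A, d):
    """Minimize d^T p + (1/t) * sum_j max(r_j - a_j^T p, 0) over p >= 0
    via the LP epigraph form: introduce y_j >= r_j - a_j^T p, y_j >= 0.
    R: (t,) observed rewards, A: (t, m) observed columns, d: (m,) budget rate."""
    t, m = A.shape
    c = np.concatenate([d, np.full(t, 1.0 / t)])   # objective over (p, y)
    A_ub = np.hstack([-A, -np.eye(t)])             # r_j - a_j^T p - y_j <= 0
    res = linprog(c, A_ub=A_ub, b_ub=-R,
                  bounds=[(0, None)] * (m + t), method="highs")
    return res.x[:m]

def action_history_dependent(R, A, b):
    """Accept order t iff its reward beats the current dual-priced cost,
    re-estimating the price from history and *remaining* capacity each step."""
    n, m = A.shape
    remaining = b.astype(float).copy()
    x = np.zeros(n)
    for t in range(n):
        p = np.zeros(m) if t == 0 else dual_price(R[:t], A[:t], remaining / (n - t))
        if R[t] > A[t] @ p and np.all(A[t] <= remaining):  # packing-style guard
            x[t] = 1.0
            remaining -= A[t]
    return x
```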




Read also

We present simple and efficient algorithms for the batched stochastic multi-armed bandit and batched stochastic linear bandit problems. We prove bounds for their expected regrets that improve over the best-known regret bounds for any number of batches. In particular, our algorithms in both settings achieve the optimal expected regrets by using only a logarithmic number of batches. We also study the batched adversarial multi-armed bandit problem for the first time and find the optimal regret, up to logarithmic factors, of any algorithm with predetermined batch sizes.
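For intuition on how batch constraints interact with regret, here is a sketch of a standard batched successive-elimination scheme for the stochastic multi-armed bandit; it is not the paper's algorithm, and the uniform batch sizes and Hoeffding-style confidence widths are simplifying assumptions (the batched-bandit literature obtains near-optimal regret with geometrically spaced batches).

```python
import numpy as np

def batched_elimination(pull, n_arms, horizon, n_batches):
    """Within each batch, pull every surviving arm equally; between batches,
    drop arms whose upper confidence bound falls below the best lower
    confidence bound. pull(arm, k) -> array of k rewards in [0, 1]."""
    active = list(range(n_arms))
    sums = np.zeros(n_arms)
    counts = np.zeros(n_arms)
    per_arm = max(1, horizon // (n_batches * n_arms))
    for _ in range(n_batches):
        for a in active:
            sums[a] += np.sum(pull(a, per_arm))
            counts[a] += per_arm
        means = {a: sums[a] / counts[a] for a in active}
        width = {a: np.sqrt(2 * np.log(horizon) / counts[a]) for a in active}
        best_lcb = max(means[a] - width[a] for a in active)   # leader's LCB
        active = [a for a in active if means[a] + width[a] >= best_lcb]
    return active
```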
This paper considers a variant of the online paging problem, where the online algorithm has access to multiple predictors, each producing a sequence of predictions for the page arrival times. The predictors may have occasional prediction errors and it is assumed that at least one of them makes a sublinear number of prediction errors in total. Our main result states that this assumption suffices for the design of a randomized online algorithm whose time-average regret with respect to the optimal offline algorithm tends to zero as the time tends to infinity. This holds (with different regret bounds) for both the full information access model, where in each round, the online algorithm gets the predictions of all predictors, and the bandit access model, where in each round, the online algorithm queries a single predictor. While online algorithms that exploit inaccurate predictions have been a topic of growing interest in the last few years, to the best of our knowledge, this is the first paper that studies this topic in the context of multiple predictors for an online problem with unbounded request sequences. Moreover, to the best of our knowledge, this is also the first paper that aims for (and achieves) online algorithms with a vanishing regret for a classic online problem under reasonable assumptions.
In this paper, we develop a simple and fast online algorithm for solving a class of binary integer linear programs (LPs) arising in general resource allocation problems. The algorithm requires only a single pass through the input data and performs no matrix inversion. It can be viewed both as an approximation algorithm for solving binary integer LPs and as a fast algorithm for solving online LP problems. The algorithm is inspired by an equivalent form of the dual problem of the relaxed LP, and it essentially performs (one-pass) projected stochastic subgradient descent in the dual space. We analyze the algorithm in two different models, stochastic input and random permutation, with minimal technical assumptions on the input data. The algorithm achieves $O\left(m\sqrt{n}\right)$ expected regret under the stochastic input model and $O\left((m+\log n)\sqrt{n}\right)$ expected regret under the random permutation model, and it achieves $O(m\sqrt{n})$ expected constraint violation under both models, where $n$ is the number of decision variables and $m$ is the number of constraints. The algorithm enjoys the same performance guarantee when generalized to a multi-dimensional LP setting, which covers a wider range of applications. In addition, we employ the notion of permutational Rademacher complexity and derive regret bounds for two earlier online LP algorithms for comparison. Both algorithms improve the regret bound by a factor of $\sqrt{m}$ at the price of higher computational cost. Furthermore, we demonstrate how to convert a possibly infeasible solution into a feasible one through a randomized procedure. Numerical experiments illustrate the general applicability and effectiveness of the algorithms.
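The one-pass dual update described above admits a very short sketch; the step size $1/\sqrt{n}$ and the threshold decision rule are representative choices rather than the paper's exact parameters, and, as in the paper, the raw output may violate constraints before any feasibility post-processing.

```python
import numpy as np

def one_pass_dual_sgd(R, A, b):
    """One pass of projected stochastic subgradient descent in the dual space.
    R: (n,) objective coefficients, A: (n, m) constraint rows, b: (m,) budgets."""
    n, m = A.shape
    d = b / n                                   # per-period resource budget
    p = np.zeros(m)                             # dual price vector, p >= 0
    gamma = 1.0 / np.sqrt(n)                    # representative step size
    x = np.zeros(n)
    for t in range(n):
        x[t] = 1.0 if R[t] > A[t] @ p else 0.0  # threshold decision at price p
        g = d - A[t] * x[t]                     # subgradient of the dual sample
        p = np.maximum(p - gamma * g, 0.0)      # projected step onto p >= 0
    return x, p
```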
Consider an online facility assignment problem where a set of facilities $F = \{ f_1, f_2, f_3, \ldots, f_{|F|} \}$ of equal capacity $l$ is situated on a metric space and customers arrive one by one in an online manner on that space. We assign a customer $c_i$ to a facility $f_j$ before a new customer $c_{i+1}$ arrives. The cost of this assignment is the distance between $c_i$ and $f_j$. The objective of this problem is to minimize the sum of all assignment costs. Recently, Ahmed et al. (TCS, 806, pp. 455-467, 2020) studied the problem where the facilities are situated on a line and computed the competitive ratio of Algorithm Greedy, which assigns each customer to the nearest available facility. They also computed the competitive ratio of Algorithm Optimal-Fill, which assigns a new customer by considering an optimal assignment of all previous customers, and they studied the problem where the facilities are situated on a connected unweighted graph. In this paper, we first consider that $F$ is situated on the vertices of a connected unweighted grid graph $G$ of size $r \times c$ and customers arrive one by one at positions on the vertices of $G$. We show that Algorithm Greedy has competitive ratio $r \times c + r + c$ and Algorithm Optimal-Fill has competitive ratio $O(r \times c)$. We later show that the competitive ratio of Algorithm Optimal-Fill is $2|F|$ for any arbitrary graph. Our bound is tight and better than the previous result. We also consider facilities distributed arbitrarily on a plane and provide an algorithm for that scenario with competitive ratio $(2n-1)$. Finally, we consider a straight-line metric space and show that no algorithm for the online facility assignment problem has competitive ratio less than $9.001$.
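As an illustration, here is a minimal sketch of Algorithm Greedy on a line metric; the function name and capacity bookkeeping are assumptions for exposition, and the distance computation would be replaced by shortest-path distance in the grid-graph setting.

```python
def greedy_assign(facilities, capacity, customers):
    """Algorithm Greedy: assign each arriving customer to the nearest
    facility with residual capacity; the cost is the distance between them.
    facilities: positions on a line; customers: arrival positions in order."""
    remaining = [capacity] * len(facilities)
    assignment, total_cost = [], 0.0
    for c in customers:
        best, best_dist = None, float("inf")
        for j, f in enumerate(facilities):
            if remaining[j] > 0:
                dist = abs(f - c)        # line metric; swap in graph distance
                if dist < best_dist:
                    best, best_dist = j, dist
        remaining[best] -= 1             # assumes total capacity suffices
        assignment.append(best)
        total_cost += best_dist
    return assignment, total_cost
```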
A dominant approach to solving large imperfect-information games is Counterfactual Regret Minimization (CFR). In CFR, many regret minimization problems are combined to solve the game. For very large games, abstraction is typically needed to render CFR tractable. Abstractions are often manually tuned, possibly removing important strategic differences in the full game and harming performance. Function approximation provides a natural solution to finding good abstractions to approximate the full game. A common approach to incorporating function approximation is to learn the inputs needed by a regret minimizing algorithm, allowing for generalization across many regret minimization problems. This paper gives regret bounds when a regret minimizing algorithm uses estimates instead of true values. This form of analysis is the first to generalize to a larger class of $(\Phi, f)$-regret matching algorithms, and includes different forms of regret such as swap, internal, and external regret. We demonstrate how these results give a slightly tighter bound for Regression Regret-Matching (RRM), and present a novel bound for combining regression with Hedge.
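For a concrete instance of the simplest member of this family, below is an external-regret regret-matching loop in which the per-action payoff vector may come from a regression model rather than from true counterfactual values; this toy sketch only locates where estimates enter the update, and does not reproduce the paper's full $(\Phi, f)$-regret framework.

```python
import numpy as np

def regret_matching(estimated_payoffs, n_actions):
    """External-regret regret matching driven by (possibly estimated) payoffs.
    estimated_payoffs: iterable of length-n_actions payoff vectors, e.g. the
    output of a regression model standing in for true counterfactual values."""
    cum_regret = np.zeros(n_actions)
    strategy = np.full(n_actions, 1.0 / n_actions)   # start uniform
    for u in estimated_payoffs:
        u = np.asarray(u, dtype=float)
        cum_regret += u - strategy @ u               # regret vs. mixed play
        pos = np.maximum(cum_regret, 0.0)
        total = pos.sum()
        strategy = (pos / total if total > 0
                    else np.full(n_actions, 1.0 / n_actions))
    return strategy
```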

