How Linear Reward Helps in Online Resource Allocation


الملخص بالإنكليزية

In this paper, we consider an online stochastic resource allocation problem which takes a linear program as its underlying form. We analyze an adaptive allocation algorithm and derives a constant regret bound that is not dependent on the number of time periods (number of decision variables) under the condition that the objective coefficient of the linear program is linear in the corresponding constraint coefficients. Furthermore, the constant regret bound does not assume the knowledge of underlying distribution.

تحميل البحث