Online Posted Pricing with Unknown Time-Discounted Valuations

90 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Alberto Marchesi

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Giulia Romano - Gianluca Tartaglia - Alberto Marchesi

علوم الكمبيوتر ونظرية الألعاب

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We study the problem of designing posted-price mechanisms in order to sell a single unit of a single item within a finite period of time. Motivated by real-world problems, such as, e.g., long-term rental of rooms and apartments, we assume that customers arrive online according to a Poisson process, and their valuations are drawn from an unknown distribution and discounted over time. We evaluate our mechanisms in terms of competitive ratio, measuring the worst-case ratio between their revenue and that of an optimal mechanism that knows the distribution of valuations. First, we focus on the identical valuation setting, where all the customers value the item for the same amount. In this setting, we provide a mechanism M_c that achieves the best possible competitive ratio, discussing its dependency on the parameters in the case of linear discount. Then, we switch to the random valuation setting. We show that, if we restrict the attention to distributions of valuations with a monotone hazard rate, then the competitive ratio of M_c is lower bounded by a strictly positive constant that does not depend on the distribution. Moreover, we provide another mechanism, called M_pc, which is defined by a piecewise constant pricing strategy and reaches performances comparable to those obtained with M_c. This mechanism is useful when the seller cannot change the posted price too often. Finally, we empirically evaluate the performances of our mechanisms in a number of experimental settings.

قيم البحث

182 - Shuchi Chawla , Jason Hartline , Robert Kleinberg 2008

Algorithmic pricing is the computational problem that sellers (e.g., in supermarkets) face when trying to set prices for their items to maximize their profit in the presence of a known demand. Guruswami et al. (2005) propose this problem and give log arithmic approximations (in the number of consumers) when each consumers values for bundles are known precisely. Subsequently severa

علوم الكمبيوتر ونظرية الألعاب بنى وهياكل البيانات والخوارزميات

Stateful Posted Pricing with Vanishing Regret via Dynamic Deterministic Markov Decision Processes

122 - Yuval Emek , Ron Lavi , Rad Niazadeh 2020

In this paper, a rather general online problem called dynamic resource allocation with capacity constraints (DRACC) is introduced and studied in the realm of posted price mechanisms. This problem subsumes several applications of stateful pricing, inc luding but not limited to posted prices for online job scheduling and matching over a dynamic bipartite graph. As the existing online learning techniques do not yield vanishing-regret mechanisms for this problem, we develop a novel online learning framework defined over deterministic Markov decision processes with dynamic state transition and reward functions. We then prove that if the Markov decision process is guaranteed to admit an oracle that can simulate any given policy from any initial state with bounded loss -- a condition that is satisfied in the DRACC problem -- then the online learning problem can be solved with vanishing regret. Our proof technique is based on a reduction to online learning with switching cost, in which an online decision maker incurs an extra cost every time she switches from one arm to another. We formally demonstrate this connection and further show how DRACC can be used in our proposed applications of stateful pricing.

علوم الكمبيوتر ونظرية الألعاب

Asymptotically Optimal Welfare of Posted Pricing for Multiple Items with MHR Distributions

84 - Alexander Braun , Matthias Buttkus , Thomas Kesselheim 2021

We consider the problem of posting prices for unit-demand buyers if all $n$ buyers have identically distributed valuations drawn from a distribution with monotone hazard rate. We show that even with multiple items asymptotically optimal welfare can b e guaranteed. Our main results apply to the case that either a buyers value for different items are independent or that they are perfectly correlated. We give mechanisms using dynamic prices that obtain a $1 - Theta left( frac{1}{log n}right)$-fraction of the optimal social welfare in expectation. Furthermore, we devise mechanisms that only use static item prices and are $1 - Theta left( frac{logloglog n}{log n}right)$-competitive compared to the optimal social welfare. As we show, both guarantees are asymptotically optimal, even for a single item and exponential distributions.

علوم الكمبيوتر ونظرية الألعاب بنى وهياكل البيانات والخوارزميات

Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games

81 - Gabriele Farina , Tuomas Sandholm 2021

Regret minimization has proved to be a versatile tool for tree-form sequential decision making and extensive-form games. In large two-player zero-sum imperfect-information games, modern extensions of counterfactual regret minimization (CFR) are curre ntly the practical state of the art for computing a Nash equilibrium. Most regret-minimization algorithms for tree-form sequential decision making, including CFR, require (i) an exact model of the players decision nodes, observation nodes, and how they are linked, and (ii) full knowledge, at all times t, about the payoffs -- even in parts of the decision space that are not encountered at time t. Recently, there has been growing interest towards relaxing some of those restrictions and making regret minimization applicable to settings for which reinforcement learning methods have traditionally been used -- for example, those in which only black-box access to the environment is available. We give the first, to our knowledge, regret-minimization algorithm that guarantees sublinear regret with high probability even when requirement (i) -- and thus also (ii) -- is dropped. We formalize an online learning setting in which the strategy space is not known to the agent and gets revealed incrementally whenever the agent encounters new decision points. We give an efficient algorithm that achieves $O(T^{3/4})$ regret with high probability for that setting, even when the agent faces an adversarial environment. Our experiments show it significantly outperforms the prior algorithms for the problem, which do not have such guarantees. It can be used in any application for which regret minimization is useful: approximating Nash equilibrium or quantal response equilibrium, approximating coarse correlated equilibrium in multi-player games, learning a best response, learning safe opponent exploitation, and online play against an unknown opponent/environment.

علوم الكمبيوتر ونظرية الألعاب التعلم الآلي

Optimal Pricing in Networks with Externalities

133 - Ozan Candogan , Kostas Bimpikis , Asuman Ozdaglar 2011

We study the optimal pricing strategies of a monopolist selling a divisible good (service) to consumers that are embedded in a social network. A key feature of our model is that consumers experience a (positive) local network effect. In particular, e ach consumers usage level depends directly on the usage of her neighbors in the social network structure. Thus, the monopolists optimal pricing strategy may involve offering discounts to certain agents, who have a central position in the underlying network. First, we consider a setting where the monopolist can offer individualized prices and derive an explicit characterization of the optimal price for each consumer as a function of her network position. In particular, we show that it is optimal for the monopolist to charge each agent a price that is proportional to her Bonacich centrality in the social network. In the second part of the paper, we discuss the optimal strategy of a monopolist that can only choose a single uniform price for the good and derive an algorithm polynomial in the number of agents to compute such a price. Thirdly, we assume that the monopolist can offer the good in two prices, full and discounted, and study the problem of determining which set of consumers should be given the discount. We show that the problem is NP-hard, however we provide an explicit characterization of the set of agents that should be offered the discounted price. Next, we describe an approximation algorithm for finding the optimal set of agents. We show that if the profit is nonnegative under any feasible price allocation, the algorithm guarantees at least 88% of the optimal profit. Finally, we highlight the value of network information by comparing the profits of a monopolist that does not take into account the network effects when choosing her pricing policy to those of a monopolist that uses this information optimally.

علوم الكمبيوتر ونظرية الألعاب بنية الشبكات والإنترنت

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

معھد الشام العالي للعلوم الشرعية واللغة العربية والدراسات والبحوث الإسلامية

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Online Posted Pricing with Unknown Time-Discounted Valuations

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً