
Online Posted Pricing with Unknown Time-Discounted Valuations

Posted by Alberto Marchesi
Publication date: 2020
Research field: Information engineering
Paper language: English





We study the problem of designing posted-price mechanisms to sell a single unit of a single item within a finite period of time. Motivated by real-world problems, such as the long-term rental of rooms and apartments, we assume that customers arrive online according to a Poisson process and that their valuations are drawn from an unknown distribution and discounted over time. We evaluate our mechanisms in terms of competitive ratio, measuring the worst-case ratio between their revenue and that of an optimal mechanism that knows the distribution of valuations. First, we focus on the identical valuation setting, where all customers value the item at the same amount. In this setting, we provide a mechanism M_c that achieves the best possible competitive ratio, and we discuss how this ratio depends on the problem parameters in the case of a linear discount. Then, we switch to the random valuation setting. We show that, if we restrict attention to distributions of valuations with a monotone hazard rate, the competitive ratio of M_c is lower bounded by a strictly positive constant that does not depend on the distribution. Moreover, we provide another mechanism, called M_pc, which is defined by a piecewise constant pricing strategy and achieves performance comparable to that of M_c. This mechanism is useful when the seller cannot change the posted price too often. Finally, we empirically evaluate the performance of our mechanisms in a number of experimental settings.
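To make the model concrete, the following is a minimal Monte Carlo sketch of the setting described above: customers arrive according to a Poisson process, draw valuations from a distribution unknown to the seller, discount them linearly over time, and buy at the first moment their discounted valuation meets the posted price. The arrival rate, horizon, valuation distribution, and the two price schedules are illustrative placeholders; the schedules only mimic the flavor of M_c (a continuously varying price) and M_pc (a piecewise constant price), not the paper's actual mechanisms.

import random

def simulate_revenue(price_fn, lam=2.0, horizon=1.0,
                     draw_value=lambda rng: rng.expovariate(1.0),
                     n_trials=20000, seed=0):
    """Monte Carlo estimate of the expected revenue of a posted-price rule.

    Customers arrive by a Poisson process with rate `lam` on [0, horizon];
    each draws a base valuation via `draw_value`, discounted linearly over
    time as v(t) = v * (1 - t / horizon). The first customer whose
    discounted valuation meets the current posted price buys at that price.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        t = 0.0
        while True:
            t += rng.expovariate(lam)              # next Poisson arrival
            if t >= horizon:
                break                              # deadline passed: no sale
            discounted = draw_value(rng) * (1.0 - t / horizon)
            p = price_fn(t)
            if discounted >= p:
                total += p                         # sale at the posted price
                break
    return total / n_trials

# Illustrative price schedules (stand-ins, not the paper's M_c / M_pc):
smooth_price = lambda t: 1.5 * (1.0 - t)                          # varies continuously
step_price = lambda t: (1.5, 1.0, 0.6, 0.3)[min(int(4 * t), 3)]   # piecewise constant

print("smooth:", simulate_revenue(smooth_price))
print("steps: ", simulate_revenue(step_price))

Running the two estimates side by side mirrors, in miniature, the paper's empirical comparison between a freely varying price curve and one the seller may only change a few times.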




Read also

Algorithmic pricing is the computational problem that sellers (e.g., in supermarkets) face when trying to set prices for their items to maximize their profit in the presence of a known demand. Guruswami et al. (2005) propose this problem and give logarithmic approximations (in the number of consumers) when each consumer's values for bundles are known precisely. Subsequently severa…
In this paper, a rather general online problem called dynamic resource allocation with capacity constraints (DRACC) is introduced and studied in the realm of posted-price mechanisms. This problem subsumes several applications of stateful pricing, including but not limited to posted prices for online job scheduling and matching over a dynamic bipartite graph. As the existing online learning techniques do not yield vanishing-regret mechanisms for this problem, we develop a novel online learning framework defined over deterministic Markov decision processes with dynamic state transition and reward functions. We then prove that if the Markov decision process is guaranteed to admit an oracle that can simulate any given policy from any initial state with bounded loss -- a condition that is satisfied in the DRACC problem -- then the online learning problem can be solved with vanishing regret. Our proof technique is based on a reduction to online learning with switching cost, in which an online decision maker incurs an extra cost every time she switches from one arm to another. We formally demonstrate this connection and further show how DRACC can be used in our proposed applications of stateful pricing.
We consider the problem of posting prices for unit-demand buyers if all $n$ buyers have identically distributed valuations drawn from a distribution with monotone hazard rate. We show that even with multiple items asymptotically optimal welfare can be guaranteed. Our main results apply to the case that either a buyer's values for different items are independent or that they are perfectly correlated. We give mechanisms using dynamic prices that obtain a $1 - \Theta\left(\frac{1}{\log n}\right)$-fraction of the optimal social welfare in expectation. Furthermore, we devise mechanisms that only use static item prices and are $1 - \Theta\left(\frac{\log\log\log n}{\log n}\right)$-competitive compared to the optimal social welfare. As we show, both guarantees are asymptotically optimal, even for a single item and exponential distributions.
Regret minimization has proved to be a versatile tool for tree-form sequential decision making and extensive-form games. In large two-player zero-sum imperfect-information games, modern extensions of counterfactual regret minimization (CFR) are currently the practical state of the art for computing a Nash equilibrium. Most regret-minimization algorithms for tree-form sequential decision making, including CFR, require (i) an exact model of the player's decision nodes, observation nodes, and how they are linked, and (ii) full knowledge, at all times $t$, about the payoffs -- even in parts of the decision space that are not encountered at time $t$. Recently, there has been growing interest towards relaxing some of those restrictions and making regret minimization applicable to settings for which reinforcement learning methods have traditionally been used -- for example, those in which only black-box access to the environment is available. We give the first, to our knowledge, regret-minimization algorithm that guarantees sublinear regret with high probability even when requirement (i) -- and thus also (ii) -- is dropped. We formalize an online learning setting in which the strategy space is not known to the agent and gets revealed incrementally whenever the agent encounters new decision points. We give an efficient algorithm that achieves $O(T^{3/4})$ regret with high probability for that setting, even when the agent faces an adversarial environment. Our experiments show it significantly outperforms the prior algorithms for the problem, which do not have such guarantees. It can be used in any application for which regret minimization is useful: approximating Nash equilibrium or quantal response equilibrium, approximating coarse correlated equilibrium in multi-player games, learning a best response, learning safe opponent exploitation, and online play against an unknown opponent/environment.
We study the optimal pricing strategies of a monopolist selling a divisible good (service) to consumers that are embedded in a social network. A key feature of our model is that consumers experience a (positive) local network effect. In particular, each consumer's usage level depends directly on the usage of her neighbors in the social network structure. Thus, the monopolist's optimal pricing strategy may involve offering discounts to certain agents who have a central position in the underlying network. First, we consider a setting where the monopolist can offer individualized prices and derive an explicit characterization of the optimal price for each consumer as a function of her network position. In particular, we show that it is optimal for the monopolist to charge each agent a price that is proportional to her Bonacich centrality in the social network. In the second part of the paper, we discuss the optimal strategy of a monopolist that can only choose a single uniform price for the good and derive an algorithm, polynomial in the number of agents, to compute such a price. Thirdly, we assume that the monopolist can offer the good at two prices, full and discounted, and study the problem of determining which set of consumers should be given the discount. We show that the problem is NP-hard; however, we provide an explicit characterization of the set of agents that should be offered the discounted price. Next, we describe an approximation algorithm for finding the optimal set of agents. We show that if the profit is nonnegative under any feasible price allocation, the algorithm guarantees at least 88% of the optimal profit. Finally, we highlight the value of network information by comparing the profits of a monopolist that does not take into account the network effects when choosing her pricing policy to those of a monopolist that uses this information optimally.
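The closed-form characterization in the last abstract (individualized prices proportional to Bonacich centrality) is straightforward to illustrate. The sketch below uses the standard definition b = (I - alpha*G)^{-1} * 1; the 4-agent network, alpha, and the scale factor kappa are hypothetical values chosen for the example, not taken from the paper.

import numpy as np

def bonacich_centrality(G, alpha):
    """Bonacich centrality b = (I - alpha*G)^{-1} @ 1 for adjacency matrix G.

    Requires alpha < 1 / spectral_radius(G) so the inverse is well defined
    as a convergent Neumann series.
    """
    n = G.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * G, np.ones(n))

# Hypothetical 4-agent line network; alpha and kappa are illustrative only.
G = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
alpha = 0.2    # safely below 1 / spectral_radius(G), which is ~0.618 here
kappa = 1.0    # hypothetical proportionality constant
prices = kappa * bonacich_centrality(G, alpha)
print(np.round(prices, 3))   # prices track each agent's network centrality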
