ترغب بنشر مسار تعليمي؟ اضغط هنا

RoxyBot-06: Stochastic Prediction and Optimization in TAC Travel

254   0   0.0 ( 0 )
 نشر من قبل Amy Greenwald
 تاريخ النشر 2014
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In this paper, we describe our autonomous bidding agent, RoxyBot, who emerged victorious in the travel division of the 2006 Trading Agent Competition in a photo finish. At a high level, the design of many successful trading agents can be summarized as follows: (i) price prediction: build a model of market prices; and (ii) optimization: solve for an approximately optimal set of bids, given this model. To predict, RoxyBot builds a stochastic model of market prices by simulating simultaneous ascending auctions. To optimize, RoxyBot relies on the sample average approximation method, a stochastic optimization technique.



قيم البحث

اقرأ أيضاً

We study the problem of an online advertising system that wants to optimally spend an advertisers given budget for a campaign across multiple platforms, without knowing the value for showing an ad to the users on those platforms. We model this challe nging practical application as a Stochastic Bandits with Knapsacks problem over $T$ rounds of bidding with the set of arms given by the set of distinct bidding $m$-tuples, where $m$ is the number of platforms. We modify the algorithm proposed in Badanidiyuru emph{et al.,} to extend it to the case of multiple platforms to obtain an algorithm for both the discrete and continuous bid-spaces. Namely, for discrete bid spaces we give an algorithm with regret $Oleft(OPT sqrt {frac{mn}{B} }+ sqrt{mn OPT}right)$, where $OPT$ is the performance of the optimal algorithm that knows the distributions. For continuous bid spaces the regret of our algorithm is $tilde{O}left(m^{1/3} cdot minleft{ B^{2/3}, (m T)^{2/3} right} right)$. When restricted to this special-case, this bound improves over Sankararaman and Slivkins in the regime $OPT ll T$, as is the case in the particular application at hand. Second, we show an $ Omegaleft (sqrt {m OPT} right)$ lower bound for the discrete case and an $Omegaleft( m^{1/3} B^{2/3}right)$ lower bound for the continuous setting, almost matching the upper bounds. Finally, we use a real-world data set from a large internet online advertising company with multiple ad platforms and show that our algorithms outperform common benchmarks and satisfy the required properties warranted in the real-world application.
This paper presents a new partial two-player game, called the emph{cannibal animal game}, which is a variant of Tic-Tac-Toe. The game is played on the infinite grid, where in each round a player chooses and occupies free cells. The first player Alice can occupy a cell in each turn and wins if she occupies a set of cells, the union of a subset of which is a translated, reflected and/or rotated copy of a previously agreed upon polyomino $P$ (called an emph{animal}). The objective of the second player Bob is to prevent Alice from creating her animal by occupying in each round a translated, reflected and/or rotated copy of $P$. An animal is a emph{cannibal} if Bob has a winning strategy, and a emph{non-cannibal} otherwise. This paper presents some new tools, such as the emph{bounding strategy} and the emph{punching lemma}, to classify animals into cannibals or non-cannibals. We also show that the emph{pairing strategy} works for this problem.
We develop a computationally efficient technique to solve a fairly general distributed service provision problem with selfish users and imperfect information. In particular, in a context in which the service capacity of the existing infrastructure ca n be partially adapted to the user load by activating just some of the service units, we aim at finding the configuration of active service units that achieves the best trade-off between maintenance (e.g. energetic) costs for the provider and user satisfaction. The core of our technique resides in the implementation of a belief-propagation (BP) algorithm to evaluate the cost configurations. Numerical results confirm the effectiveness of our approach.
We present fictitious play dynamics for stochastic games and analyze its convergence properties in zero-sum stochastic games. Our dynamics involves players forming beliefs on opponent strategy and their own continuation payoff (Q-function), and playi ng a greedy best response using estimated continuation payoffs. Players update their beliefs from observations of opponent actions. A key property of the learning dynamics is that update of the beliefs on Q-functions occurs at a slower timescale than update of the beliefs on strategies. We show both in the model-based and model-free cases (without knowledge of player payoff functions and state transition probabilities), the beliefs on strategies converge to a stationary mixed Nash equilibrium of the zero-sum stochastic game.
A case study of the Singapore road network provides empirical evidence that road pricing can significantly affect commuter trip timing behaviors. In this paper, we propose a model of trip timing decisions that reasonably matches the observed commuter s behaviors. Our model explicitly captures the difference in individuals sensitivity to price, travel time and early or late arrival at destination. New pricing schemes are suggested to better spread peak travel and reduce traffic congestion. Simulation results based on the proposed model are provided in comparison with the real data for the Singapore case study.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا