We design mechanisms for online procurement of data held by strategic agents for machine learning tasks. The challenge is to use past data to actively price future data and to give learning guarantees even when an agent's cost for revealing her data may depend arbitrarily on the data itself. We achieve this goal by showing how to convert a large class of no-regret algorithms into online posted-price and learning mechanisms. Our results in a sense parallel classic sample complexity guarantees, but with the key resource being money rather than quantity of data: with a budget constraint $B$, we give robust risk (predictive error) bounds on the order of $1/\sqrt{B}$. Because we use an active approach, we can often guarantee to do significantly better by leveraging correlations between costs and data. Our algorithms and analysis go through a model of no-regret learning with $T$ arriving pairs (cost, data) and a budget constraint of $B$. Our regret bounds for this model are on the order of $T/\sqrt{B}$, and we give lower bounds of the same order.
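The abstract does not spell out the mechanism, but the model it describes (post a price, buy the arriving data point iff its cost is at most that price, stop when the budget $B$ is exhausted) is easy to sketch. Below is a minimal, illustrative posted-price loop driven by multiplicative weights over a discretized price grid; the grid, the toy utility, and the assumption that the cost is revealed after each round are all assumptions of this sketch, not the paper's construction.

```python
import math
import random

def posted_price_learner(stream, budget, prices, eta=0.1):
    """Illustrative posted-price loop: multiplicative weights over a price grid.

    `stream` yields (cost, example) pairs; a seller sells iff her cost is at
    most the posted price. The toy utility and full-information updates are
    simplifying assumptions, not the paper's mechanism.
    """
    weights = [1.0] * len(prices)
    spent, data = 0.0, []
    for cost, example in stream:
        total = sum(weights)
        i = random.choices(range(len(prices)), [w / total for w in weights])[0]
        price = prices[i]
        if cost <= price and spent + price <= budget:
            spent += price
            data.append(example)  # bought point becomes training data
        # Assume the cost is revealed after the round, so every candidate
        # price can be scored (a bandit-style update would drop this).
        for j, p in enumerate(prices):
            utility = (1.0 if cost <= p else 0.0) - p / budget  # toy trade-off
            weights[j] *= math.exp(eta * utility)
    return data, spent
```

Analyses of this flavor trade off money spent against data acquired over the $T$ rounds, which is where $T/\sqrt{B}$-type regret bounds arise.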
Active reinforcement learning (ARL) is a variant of reinforcement learning in which the agent does not observe the reward unless it chooses to pay a query cost $c > 0$. The central question of ARL is how to quantify the long-term value of reward information.
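To make the query-cost trade-off concrete, here is a minimal bandit-style sketch in which acting is free but observing the realized reward costs $c$; the explore-then-commit heuristic and all parameter names are illustrative assumptions, not the paper's algorithm.

```python
import random

def active_bandit(arms, horizon, c, n_explore=25):
    """Sketch of active RL in a bandit setting: each arm is a callable that
    samples a reward, but the agent only sees that reward if it pays c.
    Heuristic: pay to observe each arm n_explore times, then exploit blindly.
    """
    means = [0.0] * len(arms)
    counts = [0] * len(arms)
    total, spent = 0.0, 0.0
    for _ in range(horizon):
        under = [a for a in range(len(arms)) if counts[a] < n_explore]
        if under:
            a = random.choice(under)
            r = arms[a]()              # sample a reward from arm a
            spent += c                 # pay the query cost to observe r
            counts[a] += 1
            means[a] += (r - means[a]) / counts[a]
            total += r
        else:
            a = max(range(len(arms)), key=lambda i: means[i])
            total += arms[a]()         # reward is earned but not observed
    return total - spent
```

The long-term value of reward information is exactly what governs the choice of `n_explore` here: querying too little risks committing to a bad arm, querying too much wastes $c$ per round.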
While game theory is widely used to model strategic interactions, a natural question is where the game representations come from. One answer is to learn the representations from data. If one wants to learn both the payoffs and the players' strategies
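As a toy illustration of learning a game representation from data (assuming we observe logged action profiles together with realized payoff vectors; this setup is hypothetical, not the abstract's method), one can tabulate empirical payoffs and marginal strategy frequencies:

```python
from collections import Counter, defaultdict

def estimate_game(plays):
    """plays: list of (action_profile, payoff_vector) observations.
    Returns an empirical payoff table and per-player action frequencies."""
    counts = Counter()
    payoff_sums = defaultdict(list)
    marginals = defaultdict(Counter)
    for profile, payoffs in plays:
        counts[profile] += 1
        if not payoff_sums[profile]:
            payoff_sums[profile] = [0.0] * len(payoffs)
        for i, u in enumerate(payoffs):
            payoff_sums[profile][i] += u
        for i, a in enumerate(profile):
            marginals[i][a] += 1
    n = len(plays)
    payoff_table = {p: [s / counts[p] for s in sums]
                    for p, sums in payoff_sums.items()}
    strategies = {i: {a: k / n for a, k in m.items()}
                  for i, m in marginals.items()}
    return payoff_table, strategies

# e.g. two logged rounds of a 2-player game:
plays = [(("C", "D"), (0.0, 5.0)), (("C", "C"), (3.0, 3.0))]
payoffs, strategies = estimate_game(plays)
```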
A key question in cooperative game theory is that of coalitional stability, usually captured by the notion of the \emph{core}: the set of outcomes such that no subgroup of players has an incentive to deviate. However, some coalitional games have empty cores
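The core has a direct computational reading: an allocation is in the core iff it is efficient and no coalition can secure more on its own. A brute-force membership check, together with the classic three-player majority game whose core is empty, is sketched below; representing the game as a dict over frozensets is an assumption of this sketch.

```python
from itertools import combinations

def in_core(n, v, x, tol=1e-9):
    """Check whether allocation x is in the core of an n-player TU game.

    v maps each coalition (a frozenset) to its value. The core requires
    efficiency, sum(x) == v(grand coalition), and no blocking coalition:
    sum_{i in S} x_i >= v(S) for every S. Brute force, exponential in n.
    """
    players = range(n)
    if abs(sum(x) - v[frozenset(players)]) > tol:
        return False
    for r in range(1, n):
        for S in combinations(players, r):
            if sum(x[i] for i in S) < v[frozenset(S)] - tol:
                return False  # coalition S would deviate
    return True

# Three-player majority game: v(S) = 1 iff |S| >= 2. Its core is empty.
v = {frozenset(S): (1.0 if len(S) >= 2 else 0.0)
     for r in range(1, 4) for S in combinations(range(3), r)}
print(in_core(3, v, [1/3, 1/3, 1/3]))  # False: any pair gets 2/3 < 1
```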
Making decisions in the presence of a strategic opponent requires one to take into account the opponent's ability to actively mask its intended objective. To describe such strategic situations, we introduce non-cooperative inverse reinforcement learning
It is known that there are uncoupled learning heuristics leading to Nash equilibrium in all finite games. Why should players use such learning heuristics, and where could they come from? We show that there is no uncoupled learning heuristic leading to