ترغب بنشر مسار تعليمي؟ اضغط هنا

The Ratio Index for Budgeted Learning, with Applications

728   0   0.0 ( 0 )
 نشر من قبل Ashish Goel
 تاريخ النشر 2016
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In the budgeted learning problem, we are allowed to experiment on a set of alternatives (given a fixed experimentation budget) with the goal of picking a single alternative with the largest possible expected payoff. Approximation algorithms for this problem were developed by Guha and Munagala by rounding a linear program that couples the various alternatives together. In this paper we present an index for this problem, which we call the ratio index, which also guarantees a constant factor approximation. Index-based policies have the advantage that a single number (i.e. the index) can be computed for each alternative irrespective of all other alternatives, and the alternative with the highest index is experimented upon. This is analogous to the famous Gittins index for the discounted multi-armed bandit problem. The ratio index has several interesting structural properties. First, we show that it can be computed in strongly polynomial time. Second, we show that with the appropriate discount factor, the Gittins index and our ratio index are constant factor approximations of each other, and hence the Gittins index also gives a constant factor approximation to the budgeted learning problem. Finally, we show that the ratio index can be used to create an index-based policy that achieves an O(1)-approximation for the finite horizon version of the multi-armed bandit problem. Moreover, the policy does not require any knowledge of the horizon (whereas we compare its performance against an optimal strategy that is aware of the horizon). This yields the following surprising result: there is an index-based policy that achieves an O(1)-approximation for the multi-armed bandit problem, oblivious to the underlying discount factor.



قيم البحث

اقرأ أيضاً

We study approximation algorithms for the problem of minimizing the makespan on a set of machines with uncertainty on the processing times of jobs. In the model we consider, which goes back to~cite{BertsimasS03}, once the schedule is defined an adver sary can pick a scenario where deviation is added to some of the jobs processing times. Given only the maximal cardinality of these jobs, and the magnitude of potential deviation for each job, the goal is to optimize the worst-case scenario. We consider both the cases of identical and unrelated machines. Our main result is an EPTAS for the case of identical machines. We also provide a $3$-approximation algorithm and an inapproximability ratio of $2-epsilon$ for the case of unrelated machines
We tackle two long-standing problems related to re-expansions in heuristic search algorithms. For graph search, A* can require $Omega(2^{n})$ expansions, where $n$ is the number of states within the final $f$ bound. Existing algorithms that address t his problem like B and B improve this bound to $Omega(n^2)$. For tree search, IDA* can also require $Omega(n^2)$ expansions. We describe a new algorithmic framework that iteratively controls an expansion budget and solution cost limit, giving rise to new graph and tree search algorithms for which the number of expansions is $O(n log C)$, where $C$ is the optimal solution cost. Our experiments show that the new algorithms are robust in scenarios where existing algorithms fail. In the case of tree search, our new algorithms have no overhead over IDA* in scenarios to which IDA* is well suited and can therefore be recommended as a general replacement for IDA*.
We study the {em Budgeted Dominating Set} (BDS) problem on uncertain graphs, namely, graphs with a probability distribution $p$ associated with the edges, such that an edge $e$ exists in the graph with probability $p(e)$. The input to the problem con sists of a vertex-weighted uncertain graph $G=(V, E, p, omega)$ and an integer {em budget} (or {em solution size}) $k$, and the objective is to compute a vertex set $S$ of size $k$ that maximizes the expected total domination (or total weight) of vertices in the closed neighborhood of $S$. We refer to the problem as the {em Probabilistic Budgeted Dominating Set}~(PBDS) problem and present the following results. begin{enumerate} dnsitem We show that the PBDS problem is NP-complete even when restricted to uncertain {em trees} of diameter at most four. This is in sharp contrast with the well-known fact that the BDS problem is solvable in polynomial time in trees. We further show that PBDS is wone-hard for the budget parameter $k$, and under the {em Exponential time hypothesis} it cannot be solved in $n^{o(k)}$ time. item We show that if one is willing to settle for $(1-epsilon)$ approximation, then there exists a PTAS for PBDS on trees. Moreover, for the scenario of uniform edge-probabilities, the problem can be solved optimally in polynomial time. item We consider the parameterized complexity of the PBDS problem, and show that Uni-PBDS (where all edge probabilities are identical) is wone-hard for the parameter pathwidth. On the other hand, we show that it is FPT in the combined parameters of the budget $k$ and the treewidth. item Finally, we extend some of our parameterized results to planar and apex-minor-free graphs. end{enumerate}
In real-world machine learning applications, there is a cost associated with sampling of different features. Budgeted learning can be used to select which feature-values to acquire from each instance in a dataset, such that the best model is induced under a given constraint. However, this approach is not possible in the domain of online learning since one may not retroactively acquire feature-values from past instances. In online learning, the challenge is to find the optimum set of features to be acquired from each instance upon arrival from a data stream. In this paper we introduce the issue of online budgeted learning and describe a general framework for addressing this challenge. We propose two types of feature value acquisition policies based on the multi-armed bandit problem: random and adaptive. Adaptive policies perform online adjustments according to new information coming from a data stream, while random policies are not sensitive to the information that arrives from the data stream. Our comparative study on five real-world datasets indicates that adaptive policies outperform random policies for most budget limitations and datasets. Furthermore, we found that in some cases adaptive policies achieve near-optimal results.
The problem of finding factors of a text string which are identical or similar to a given pattern string is a central problem in computer science. A generalised version of this problem consists in implementing an index over the text to support effici ent on-line pattern queries. We study this problem in the case where the text is weighted: for every position of the text and every letter of the alphabet a probability of occurrence of this letter at this position is given. Sequences of this type, also called position weight matrices, are commonly used to represent imprecise or uncertain data. A weighted sequence may represent many different strings, each with probability of occurrence equal to the product of probabilities of its letters at subsequent positions. Given a probability threshold $1/z$, we say that a pattern string $P$ matches a weighted text at position $i$ if the product of probabilities of the letters of $P$ at positions $i,ldots,i+|P|-1$ in the text is at least $1/z$. In this article, we present an $O(nz)$-time construction of an $O(nz)$-sized index that can answer pattern matching queries in a weighted text in optimal time improving upon the state of the art by a factor of $z log z$. Other applications of this data structure include an $O(nz)$-time construction of the weighted prefix table and an $O(nz)$-time computation of all covers of a weighted sequence, which improve upon the state of the art by the same factor.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا