Do you want to publish a course? Click here

Preference-based Teaching

69   0   0.0 ( 0 )
 Added by Ziyuan Gao
 Publication date 2017
and research's language is English




Ask ChatGPT about the research

We introduce a new model of teaching named preference-based teaching and a corresponding complexity parameter---the preference-based teaching dimension (PBTD)---representing the worst-case number of examples needed to teach any concept in a given concept class. Although the PBTD coincides with the well-known recursive teaching dimension (RTD) on finite classes, it is radically different on infinite ones: the RTD becomes infinite already for trivial infinite classes (such as half-intervals) whereas the PBTD evaluates to reasonably small values for a wide collection of infinite classes including classes consisting of so-called closed sets w.r.t. a given closure operator, including various classes related to linear sets over $mathbb{N}_0$ (whose RTD had been studied quite recently) and including the class of Euclidean half-spaces. On top of presenting these concrete results, we provide the reader with a theoretical framework (of a combinatorial flavor) which helps to derive bounds on the PBTD.



rate research

Read More

Algorithmic machine teaching studies the interaction between a teacher and a learner where the teacher selects labeled examples aiming at teaching a target hypothesis. In a quest to lower teaching complexity, several teaching models and complexity measures have been proposed for both the batch settings (e.g., worst-case, recursive, preference-based, and non-clashing models) and the sequential settings (e.g., local preference-based model). To better understand the connections between these models, we develop a novel framework that captures the teaching process via preference functions $Sigma$. In our framework, each function $sigma in Sigma$ induces a teacher-learner pair with teaching complexity as $TD(sigma)$. We show that the above-mentioned teaching models are equivalent to specific types/families of preference functions. We analyze several properties of the teaching complexity parameter $TD(sigma)$ associated with different families of the preference functions, e.g., comparison to the VC dimension of the hypothesis class and additivity/sub-additivity of $TD(sigma)$ over disjoint domains. Finally, we identify preference functions inducing a novel family of sequential models with teaching complexity linear in the VC dimension: this is in contrast to the best-known complexity result for the batch models, which is quadratic in the VC dimension.
Effective techniques for eliciting user preferences have taken on added importance as recommender systems (RSs) become increasingly interactive and conversational. A common and conceptually appealing Bayesian criterion for selecting queries is expected value of information (EVOI). Unfortunately, it is computationally prohibitive to construct queries with maximum EVOI in RSs with large item spaces. We tackle this issue by introducing a continuous formulation of EVOI as a differentiable network that can be optimized using gradient methods available in modern machine learning (ML) computational frameworks (e.g., TensorFlow, PyTorch). We exploit this to develop a novel, scalable Monte Carlo method for EVOI optimization, which is more scalable for large item spaces than methods requiring explicit enumeration of items. While we emphasize the use of this approach for pairwise (or k-wise) comparisons of items, we also demonstrate how our method can be adapted to queries involving subsets of item attributes or partial items, which are often more cognitively manageable for users. Experiments show that our gradient-based EVOI technique achieves state-of-the-art performance across several domains while scaling to large item spaces.
The goal of task transfer in reinforcement learning is migrating the action policy of an agent to the target task from the source task. Given their successes on robotic action planning, current methods mostly rely on two requirements: exactly-relevant expert demonstrations or the explicitly-coded cost function on target task, both of which, however, are inconvenient to obtain in practice. In this paper, we relax these two strong conditions by developing a novel task transfer framework where the expert preference is applied as a guidance. In particular, we alternate the following two steps: Firstly, letting experts apply pre-defined preference rules to select related expert demonstrates for the target task. Secondly, based on the selection result, we learn the target cost function and trajectory distribution simultaneously via enhanced Adversarial MaxEnt IRL and generate more trajectories by the learned target distribution for the next preference selection. The theoretical analysis on the distribution learning and convergence of the proposed algorithm are provided. Extensive simulations on several benchmarks have been conducted for further verifying the effectiveness of the proposed method.
45 - Jer^ome Aze 2005
A key data preparation step in Text Mining, Term Extraction selects the terms, or collocation of words, attached to specific concepts. In this paper, the task of extracting relevant collocations is achieved through a supervised learning algorithm, exploiting a few collocations manually labelled as relevant/irrelevant. The candidate terms are described along 13 standard statistical criteria measures. From these examples, an evolutionary learning algorithm termed Roger, based on the optimization of the Area under the ROC curve criterion, extracts an order on the candidate terms. The robustness of the approach is demonstrated on two real-world domain applications, considering different domains (biology and human resources) and different languages (English and French).
We propose to assess the fairness of personalized recommender systems in the sense of envy-freeness: every (group of) user(s) should prefer their recommendations to the recommendations of other (groups of) users. Auditing for envy-freeness requires probing user preferences to detect potential blind spots, which may deteriorate recommendation performance. To control the cost of exploration, we propose an auditing algorithm based on pure exploration and conservative constraints in multi-armed bandits. We study, both theoretically and empirically, the trade-offs achieved by this algorithm.

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا