Model selection requires repeatedly evaluating models on a given dataset and measuring their relative performance. In modern applications of machine learning, the models being considered are increasingly expensive to evaluate and the datasets of interest are growing in size. As a result, the process of model selection is time-consuming and computationally inefficient. In this work, we develop a model-specific data subsampling strategy that improves over random sampling whenever training points have varying influence. Specifically, we leverage influence functions to guide our selection strategy, proving theoretically and demonstrating empirically that our approach quickly selects high-quality models.
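To make the sampling idea concrete, here is a minimal sketch of influence-guided subsampling for L2-regularized logistic regression. The model choice, the sampling-proportional-to-|influence| rule, and the names `influence_scores` / `influence_subsample` are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def influence_scores(X, y, w, reg=1e-3):
    """Influence-function scores I(z_i) = g_i^T H^{-1} g_bar for
    L2-regularized logistic regression (labels y in {0, 1})."""
    p = 1.0 / (1.0 + np.exp(-X @ w))             # predicted probabilities
    grads = (p - y)[:, None] * X                  # per-point gradients, shape (n, d)
    S = p * (1.0 - p)                             # Hessian weights
    H = (X.T * S) @ X / len(y) + reg * np.eye(X.shape[1])
    H_inv_gbar = np.linalg.solve(H, grads.mean(axis=0))
    return grads @ H_inv_gbar                     # one score per training point

def influence_subsample(X, y, w, m, rng):
    """Draw m points with probability proportional to |influence|; returns
    indices and importance weights that keep loss estimates unbiased
    (exactly so for with-replacement sampling)."""
    s = np.abs(influence_scores(X, y, w)) + 1e-12
    probs = s / s.sum()
    idx = rng.choice(len(y), size=m, replace=False, p=probs)
    return idx, 1.0 / (len(y) * probs[idx])
```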
Robust optimization is now widely used in data science, especially in adversarial training. However, little research has been done to quantify how robust optimization changes the optimizers and the prediction losses compared with standard training.
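As a toy illustration of the comparison in question, the sketch below fits a linear classifier under the standard logistic loss and under its worst-case counterpart for L2-bounded feature perturbations (which has a closed form for linear models), then reports how far the optimizer and the losses move. The synthetic data, eps = 0.5, and all other settings are assumptions for illustration:

```python
import numpy as np

def avg_loss(w, X, y, eps):
    """Logistic loss, worst case over ||delta||_2 <= eps (eps=0: standard).
    For linear models the inner max yields margin y*(w @ x) - eps*||w||."""
    margins = y * (X @ w) - eps * np.linalg.norm(w)
    return np.mean(np.logaddexp(0.0, -margins))

def grad(w, X, y, eps):
    margins = y * (X @ w) - eps * np.linalg.norm(w)
    s = np.exp(-np.logaddexp(0.0, margins))       # sigmoid(-margin), stable
    g = -(s[:, None] * (y[:, None] * X)).mean(axis=0)
    if eps > 0:
        g += eps * s.mean() * w / np.linalg.norm(w)
    return g

def fit(X, y, eps, steps=3000, lr=0.5):
    w = np.full(X.shape[1], 1e-3)                 # start off the kink at w = 0
    for _ in range(steps):
        w -= lr * grad(w, X, y, eps)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0]) + 0.3 * rng.normal(size=200))

w_std, w_rob = fit(X, y, eps=0.0), fit(X, y, eps=0.5)
print("optimizer shift:", np.linalg.norm(w_rob - w_std))
print("standard loss of each:", avg_loss(w_std, X, y, 0.0), avg_loss(w_rob, X, y, 0.0))
print("robust   loss of each:", avg_loss(w_std, X, y, 0.5), avg_loss(w_rob, X, y, 0.5))
```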
Identifying the influence of training data for data cleansing can improve the accuracy of deep learning. An approach based on stochastic gradient descent (SGD), called SGD-influence, was proposed to calculate these influence scores, but its calculation cost is high.
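For intuition, the sketch below computes the brute-force leave-one-out counterfactual that SGD-influence is designed to approximate without retraining, on a toy linear regression; the helper names `sgd_train` / `loo_influence` and the squared-loss setting are assumptions for illustration:

```python
import numpy as np

def sgd_train(X, y, lr=0.05, epochs=20, skip=None, seed=0):
    """Plain SGD on squared loss for a linear model, optionally skipping
    one training point (the leave-one-out counterfactual)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            if i != skip:
                w -= lr * (X[i] @ w - y[i]) * X[i]
    return w

def loo_influence(X, y, X_val, y_val):
    """Influence of each training point: validation loss after retraining
    without it, minus the baseline loss (positive = the point helped)."""
    val_loss = lambda w: 0.5 * np.mean((X_val @ w - y_val) ** 2)
    base = val_loss(sgd_train(X, y))
    return np.array([val_loss(sgd_train(X, y, skip=i)) - base
                     for i in range(len(y))])

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)
scores = loo_influence(X[:40], y[:40], X[40:], y[40:])
print("most helpful training point:", scores.argmax())
```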
Recent advances in diffusion models incorporate the Stochastic Differential Equation (SDE), which brings state-of-the-art performance on image generation tasks. This paper improves such diffusion models by analyzing the model at the zero diffusion time.
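For readers unfamiliar with the SDE formulation, here is a minimal Euler-Maruyama sampler for the reverse-time VP-SDE on 1-D Gaussian toy data, where the score is available in closed form; this is a generic sketch of SDE-based diffusion sampling under assumed parameters, not the paper's zero-diffusion-time analysis:

```python
import numpy as np

def reverse_sde_sample(n, steps=1000, beta=1.0, sigma0=0.25, seed=0):
    """Sample via the reverse-time VP-SDE dx = [f - g^2 * score] dt + g dW,
    with f = -0.5*beta*x and g = sqrt(beta), integrated from t=1 down to 0.
    For data ~ N(0, sigma0^2) the time-t marginal is N(0, var_t), so the
    exact score is -x / var_t (a trained network would replace this)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / steps
    x = rng.normal(size=n)                        # start from the prior N(0, 1)
    for k in range(steps, 0, -1):
        t = k * dt
        var_t = sigma0**2 * np.exp(-beta * t) + 1.0 - np.exp(-beta * t)
        score = -x / var_t
        drift = -0.5 * beta * x - beta * score    # f - g^2 * score
        x += -drift * dt + np.sqrt(beta * dt) * rng.normal(size=n)
    return x

print("sample std (target 0.25):", reverse_sde_sample(5000).std())
```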
The importance of explainability in machine learning continues to grow, as both neural-network architectures and the data they model become increasingly complex. Unique challenges arise when a model's input features become high-dimensional: on the one hand, …
We study the online influence maximization problem in social networks under the independent cascade model. Specifically, we aim to learn the set of best influencers in a social network online while repeatedly interacting with it. We address the challenges …
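To pin down the underlying model, the sketch below simulates independent-cascade diffusion and picks seeds greedily from Monte-Carlo spread estimates, assuming a known activation probability p; the online setting of the abstract would instead estimate p (or per-edge probabilities) from repeated interactions. Names and parameters here are illustrative:

```python
import random

def independent_cascade(graph, seeds, p, rng):
    """One diffusion under the independent cascade model: each newly
    activated node gets a single chance to activate each inactive
    neighbour, independently with probability p."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

def greedy_seeds(graph, k, p, sims=200, seed=0):
    """Pick k seeds greedily by Monte-Carlo estimates of expected spread."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        def est(v):
            return sum(len(independent_cascade(graph, chosen + [v], p, rng))
                       for _ in range(sims)) / sims
        chosen.append(max((v for v in graph if v not in chosen), key=est))
    return chosen

graph = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}
print(greedy_seeds(graph, k=2, p=0.3))
```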