
Detection Accuracy for Evaluating Compositional Explanations of Units

Published by: Biagio La Rosa
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





The recent success of deep learning models in solving complex problems across different domains has increased interest in understanding what they learn. Therefore, different approaches have been employed to explain these models, one of which uses human-understandable concepts as explanations. Two examples of methods that use this approach are Network Dissection and Compositional Explanations. The former explains units using atomic concepts, while the latter makes explanations more expressive, replacing atomic concepts with logical forms. While, intuitively, logical forms are more informative than atomic concepts, it is not clear how to quantify this improvement, and their evaluation is often based on the same metric that is optimized during the search process and relies on hyper-parameters that must be tuned. In this paper, we propose Detection Accuracy as an evaluation metric, which measures how consistently units detect their assigned explanations. We show that this metric (1) evaluates explanations of different lengths effectively, (2) can be used as a stopping criterion for the compositional explanation search, eliminating the explanation length hyper-parameter, and (3) exposes new specialized units whose length-1 explanations are the perceptual abstractions of their longer explanations.
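As a rough illustration only (the paper's exact formulation may differ), the Python sketch below shows one way a detection-accuracy-style score could be computed for a single unit. It assumes per-sample activation maps for the unit, binary masks marking where the assigned explanation holds, and an activation threshold in the style of Network Dissection; the function detection_accuracy and all of its inputs are hypothetical names introduced here for illustration.

import numpy as np

def detection_accuracy(unit_activations, explanation_masks, threshold):
    # Hedged sketch of a detection-accuracy-style metric (not the paper's
    # exact definition).
    # unit_activations: (N, H, W) activation maps of a single unit.
    # explanation_masks: (N, H, W) boolean masks marking where the
    #   explanation's logical form holds in each input.
    # threshold: scalar used to binarize the unit's activations
    #   (e.g., a top-quantile threshold as in Network Dissection).
    # Per sample, the unit "detects" its explanation if it fires above the
    # threshold exactly when the explanation is present; the score is the
    # fraction of samples on which this agreement holds.
    fired = (unit_activations > threshold).any(axis=(1, 2))
    present = explanation_masks.any(axis=(1, 2))
    return float((fired == present).mean())

# Toy usage with random data, purely illustrative.
rng = np.random.default_rng(0)
acts = rng.normal(size=(100, 7, 7))
masks = rng.random((100, 7, 7)) > 0.95
print(detection_accuracy(acts, masks, threshold=2.5))

Per the abstract, such a score serves two purposes in the paper: comparing explanations of different lengths on an equal footing and acting as a stopping criterion during the compositional explanation search.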




Read also

Explainability for machine learning models has gained considerable attention within our research community given the importance of deploying more reliable machine-learning systems. In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction, providing details about the model's decision-making. Current counterfactual methods produce ambiguous interpretations, as they combine multiple biases of the model and the data in a single counterfactual interpretation of the model's decision. Moreover, these methods tend to generate trivial counterfactuals about the model's decision, as they often suggest exaggerating or removing the presence of the attribute being classified. For the machine learning practitioner, these types of counterfactuals offer little value, since they provide no new information about undesired model or data biases. In this work, we propose a counterfactual method that learns a perturbation in a disentangled latent space, constrained by a diversity-enforcing loss, to uncover multiple valuable explanations of the model's prediction. Further, we introduce a mechanism to prevent the model from producing trivial explanations. Experiments on CelebA and Synbols demonstrate that our model improves the success rate of producing high-quality, valuable explanations when compared to previous state-of-the-art methods. We will publish the code.
Building compositional explanations requires models to combine two or more facts that, together, describe why the answer to a question is correct. Typically, these multi-hop explanations are evaluated relative to one (or a small number of) gold explanations. In this work, we show these evaluations substantially underestimate model performance, both in terms of the relevance of included facts and the completeness of model-generated explanations, because models regularly discover and produce valid explanations that differ from the gold explanations. To address this, we construct a large corpus of 126k domain-expert (science teacher) relevance ratings that augment a corpus of explanations to standardized science exam questions, discovering 80k additional relevant facts not rated as gold. We build three strong models based on different methodologies (generation, ranking, and schemas), and empirically show that while expert-augmented ratings provide better estimates of explanation quality, both original (gold) and expert-augmented automatic evaluations still substantially underestimate performance by up to 36% when compared with full manual expert judgements, with different models being disproportionately affected. This poses a significant methodological challenge to accurately evaluating explanations produced by compositional reasoning models.
We present a generative model for complex free-form structures, such as those produced in stroke-based drawing tasks. While previous approaches rely on sequence-based models for drawings of basic objects or handwritten text, we propose a model that treats drawings as a collection of strokes that can be composed into complex structures such as diagrams (e.g., flowcharts). At the core of the approach lies a novel autoencoder that projects variable-length strokes into a latent space of fixed dimension. This representation space allows a relational model, operating in latent space, to better capture the relationship between strokes and to predict subsequent strokes. We demonstrate qualitatively and quantitatively that our proposed approach is able to model the appearance of individual strokes, as well as the compositional structure of larger diagram drawings. Our approach is suitable for interactive use cases such as auto-completing diagrams. We make code and models publicly available at https://eth-ait.github.io/cose.
Self-explainable deep models are devised to represent the hidden concepts in the dataset without requiring any post-hoc explanation generation technique. We worked with one such model, motivated by explicitly representing the classifier as a linear function, and showed that exploiting a probabilistic latent space and properly modifying different parts of the model can yield better explanations as well as superior predictive performance. Apart from standard visualization techniques, we proposed a new technique that can strengthen human understanding of the hidden concepts. We also proposed using two different self-supervision techniques to extract meaningful concepts related to the type of self-supervision considered, and achieved a significant performance boost. The most important aspect of our method is that it works well in a low-data regime and reaches the desired accuracy within a small number of epochs. We reported exhaustive results on CIFAR10, CIFAR100, and AWA2 to show the effect of our method on moderate and relatively complex datasets.
Advances in machine reading comprehension (MRC) rely heavily on the collection of large-scale human-annotated examples in the form of (question, paragraph, answer) triples. In contrast, humans are typically able to generalize from only a few examples, relying on deeper underlying world knowledge, linguistic sophistication, and/or simply superior deductive powers. In this paper, we focus on teaching machines reading comprehension using a small number of semi-structured explanations that explicitly inform machines why answer spans are correct. We extract structured variables and rules from explanations and compose neural module teachers that annotate instances for training downstream MRC models. We use learnable neural modules and soft logic to handle linguistic variation and overcome sparse coverage; the modules are jointly optimized with the MRC model to improve final performance. On the SQuAD dataset, our proposed method achieves a 70.14% F1 score with supervision from 26 explanations, comparable to plain supervised learning using 1,100 labeled instances, yielding a 12x speedup.
