We tackle the challenge of in-session attribution for on-site search engines in eCommerce. We frame the problem as causal counterfactual inference, and contrast the approach with rule-based systems from industry settings and prediction models from the multi-touch attribution literature. We approach counterfactuals in analogy with treatments in formal semantics, explicitly modeling possible outcomes through alternative shopper timelines. In particular, we propose to learn a generative browsing model over a target shop, leveraging the latent space induced by prod2vec embeddings; we show how natural language queries can be effectively represented in the same space and how search intervention can be performed to assess causal contribution. Finally, we validate the methodology on a synthetic dataset that mimics important patterns that emerged in customer interviews and qualitative analysis, and we present preliminary findings on an industry dataset from a partnering shop.
Feature attributions and counterfactual explanations are popular approaches to explain an ML model. The former assigns an importance score to each input feature, while the latter provides input examples with minimal changes that alter the model's predictions. To unify these approaches, we provide an interpretation based on the actual causality framework and present two key results in terms of their use. First, we present a method to generate feature attribution explanations from a set of counterfactual examples. These feature attributions convey how important a feature is to changing the classification outcome of a model, especially on whether a subset of features is necessary and/or sufficient for that change, which attribution-based methods are unable to provide. Second, we show how counterfactual examples can be used to evaluate the goodness of an attribution-based explanation in terms of its necessity and sufficiency. As a result, we highlight the complementarity of these two approaches. Our evaluation on three benchmark datasets (Adult-Income, LendingClub, and German-Credit) confirms the complementarity. Feature attribution methods like LIME and SHAP and counterfactual explanation methods like Wachter et al. and DiCE often do not agree on feature importance rankings. In addition, by restricting the features that can be modified for generating counterfactual examples, we find that the top-k features from LIME or SHAP are often neither necessary nor sufficient explanations of a model's prediction. Finally, we present a case study of different explanation methods on a real-world hospital triage problem.
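As a rough illustration of deriving feature attributions from a set of counterfactual examples, the sketch below scores each feature by the fraction of counterfactuals in which its value differs from the original input. This scoring rule is an assumption made for illustration and may differ from the method in the paper; the function name `cf_feature_attribution`, the feature names, and the values are all hypothetical.

```python
def cf_feature_attribution(original, counterfactuals):
    """Score each feature by how often counterfactuals change it.

    Assumed rule (not taken from the abstract): the score of a feature is
    the fraction of counterfactual examples whose value for that feature
    differs from the original input.
    """
    n = len(counterfactuals)
    if n == 0:
        raise ValueError("need at least one counterfactual example")
    return {
        name: sum(cf[name] != value for cf in counterfactuals) / n
        for name, value in original.items()
    }


# Hypothetical input and four hypothetical counterfactuals that flip the label.
original = {"age": 30, "income": 40, "hours": 35}
counterfactuals = [
    {"age": 30, "income": 55, "hours": 35},
    {"age": 30, "income": 60, "hours": 45},
    {"age": 30, "income": 40, "hours": 50},
    {"age": 30, "income": 52, "hours": 35},
]

scores = cf_feature_attribution(original, counterfactuals)
print(scores)
```

Under this rule, a feature that no counterfactual ever changes receives a score of zero, which gives a simple point of comparison against rankings produced by attribution methods.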
Counterfactual frameworks have grown popular in explainable and fair machine learning, as they offer a natural notion of causation. However, state-of-the-art models to compute counterfactuals are either unrealistic or unfeasible. In particular, while Pearl's causal inference provides appealing rules to calculate counterfactuals, it relies on a model that is unknown and hard to discover in practice. We address the problem of designing realistic and feasible counterfactuals in the absence of a causal model. We define transport-based counterfactual models as collections of joint probability distributions between observable distributions, and show their connection to causal counterfactuals. More specifically, we argue that optimal transport theory defines relevant transport-based counterfactual models, as they are numerically feasible, statistically faithful, and can even coincide with causal counterfactual models. We illustrate the practicality of these models by defining sharper fairness criteria than typical group fairness conditions.
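To make the idea of a transport-based counterfactual concrete, here is a minimal sketch for the simplest case: one-dimensional samples of equal size under a convex cost such as squared distance, where the optimal coupling pairs observations by rank. The restriction to one dimension and equal sample sizes is an assumption made for illustration, and the function name `ot_counterfactuals_1d` and the sample values are hypothetical.

```python
def ot_counterfactuals_1d(source, target):
    """Pair each source value with its rank-matched target value.

    For equal-size 1D samples and a convex cost (e.g. squared distance),
    sorting both samples and matching by rank yields an optimal coupling.
    Returns (source_value, counterfactual_value) pairs in ascending order.
    """
    if len(source) != len(target):
        raise ValueError("this sketch assumes samples of equal size")
    return list(zip(sorted(source), sorted(target)))


# Hypothetical scores observed in two groups.
group_a = [3.0, 1.0, 2.0]
group_b = [10.0, 30.0, 20.0]

pairs = ot_counterfactuals_1d(group_a, group_b)
print(pairs)
```

Each pair reads as "an individual with this value in the first group would have that value had they belonged to the second group", without specifying a causal model.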
We present DEGARI (Dynamic Emotion Generator And ReclassIfier), an explainable system for emotion attribution and recommendation. This system relies on a recently introduced commonsense reasoning framework, the TCL logic, which is based on a human-like procedure for the automatic generation of novel concepts in a Description Logics knowledge base. Starting from an ontological formalization of emotions based on the Plutchik model, known as ArsEmotica, the system exploits the logic TCL to automatically generate novel commonsense semantic representations of compound emotions (e.g. Love as derived from the combination of Joy and Trust according to Plutchik). The generated emotions correspond to prototypes, i.e. commonsense representations of given concepts, and have been used to reclassify emotion-related contents in a variety of artistic domains, ranging from art datasets to the editorial contents available in RaiPlay, the online platform of RAI Radiotelevisione Italiana (the Italian public broadcasting company). We show how the reported results (evaluated in the light of the obtained reclassifications, the user ratings assigned to such reclassifications, and their explainability) are encouraging, and pave the way to many further research directions.
Language is increasingly being used to define rich visual recognition problems with supporting image collections sourced from the web. Structured prediction models are used in these tasks to take advantage of correlations between co-occurring labels and visual input but risk inadvertently encoding social biases found in web corpora. In this work, we study data and models associated with multilabel object classification and visual semantic role labeling. We find that (a) datasets for these tasks contain significant gender bias and (b) models trained on these datasets further amplify existing bias. For example, the activity cooking is over 33% more likely to involve females than males in a training set, and a trained model further amplifies the disparity to 68% at test time. We propose to inject corpus-level constraints for calibrating existing structured prediction models and design an algorithm based on Lagrangian relaxation for collective inference. Our method results in almost no performance loss for the underlying recognition task but decreases the magnitude of bias amplification by 47.5% and 40.5% for multilabel classification and visual semantic role labeling, respectively.
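One way to quantify the kind of disparity described above is sketched below: the bias of an activity toward a group is taken as the share of that activity's occurrences that co-occur with the group, and amplification as the difference between that share in model predictions and in the training data. This definition is an assumption made for illustration and is not spelled out in the abstract; the function names and the toy counts are hypothetical and are not the paper's data.

```python
from collections import Counter


def bias_score(pairs, activity, group):
    """Share of `activity` occurrences that co-occur with `group`."""
    counts = Counter(g for a, g in pairs if a == activity)
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no occurrences of activity {activity!r}")
    return counts[group] / total


def amplification(train_pairs, predicted_pairs, activity, group):
    """Predicted bias minus training bias (positive means amplified)."""
    return (bias_score(predicted_pairs, activity, group)
            - bias_score(train_pairs, activity, group))


# Hypothetical (activity, gender) annotations, not taken from the paper.
train = [("cooking", "female")] * 2 + [("cooking", "male")] * 1
predicted = [("cooking", "female")] * 5 + [("cooking", "male")] * 1

train_bias = bias_score(train, "cooking", "female")
amp = amplification(train, predicted, "cooking", "female")
print(train_bias, amp)
```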
Topology-changing transitions between vacua of different effective dimensionality are studied in the context of a 6-dimensional Einstein-Maxwell theory. The landscape of this theory includes a $6d$ de Sitter vacuum ($dS_6$), a number of $dS_4 \times S_2$ and $AdS_4 \times S_2$ vacua, and a number of $AdS_2 \times S_4$ vacua. We find that compactification transitions $dS_6 \to AdS_2 \times S_4$ occur through the nucleation of electrically charged black hole pairs, and transitions from $dS_6$ to $dS_4 \times S_2$ and $AdS_4 \times S_2$ occur through the nucleation of magnetically charged spherical black branes. We identify the appropriate instantons and describe the spacetime structure resulting from brane nucleation.