Incorporating Knowledge into Structural Equation Models using Auxiliary Variables

105 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Bryant Chen

تاريخ النشر 2015

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Bryant Chen - Judea Pearl - Elias Bareinboim

المنهجية الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In this paper, we extend graph-based identification methods by allowing background knowledge in the form of non-zero parameter values. Such information could be obtained, for example, from a previously conducted randomized experiment, from substantive understanding of the domain, or even an identification technique. To incorporate such information systematically, we propose the addition of auxiliary variables to the model, which are constructed so that certain paths will be conveniently cancelled. This cancellation allows the auxiliary variables to help conventional methods of identification (e.g., single-door criterion, instrumental variables, half-trek criterion), as well as model testing (e.g., d-separation, over-identification). Moreover, by iteratively alternating steps of identification and adding auxiliary variables, we can improve the power of existing identification methods via a bootstrapping approach that does not require external knowledge. We operationalize this method for simple instrumental sets (a generalization of instrumental variables) and show that the resulting method is able to identify at least as many models as the most general identification method for linear systems known to date. We further discuss the application of auxiliary variables to the tasks of model testing and z-identification.

قيم البحث

170 - Bryant Chen , Daniel Kumor , Elias Bareinboim 2016

We developed a novel approach to identification and model testing in linear structural equation models (SEMs) based on auxiliary variables (AVs), which generalizes a widely-used family of methods known as instrumental variables. The identification pr oblem is concerned with the conditions under which causal parameters can be uniquely estimated from an observational, non-causal covariance matrix. In this paper, we provide an algorithm for the identification of causal parameters in linear structural models that subsumes previous state-of-the-art methods. In other words, our algorithm identifies strictly more coefficients and models than methods previously known in the literature. Our algorithm builds on a graph-theoretic characterization of conditional independence relations between auxiliary and model variables, which is developed in this paper. Further, we leverage this new characterization for allowing identification when limited experimental data or new substantive knowledge about the domain is available. Lastly, we develop a new procedure for model testing using AVs.

المنهجية

Identification of Causal Effects Within Principal Strata Using Auxiliary Variables

158 - Zhichao Jiang , Peng Ding 2020

In causal inference, principal stratification is a framework for dealing with a posttreatment intermediate variable between a treatment and an outcome, in which the principal strata are defined by the joint potential values of the intermediate variab le. Because the principal strata are not fully observable, the causal effects within them, also known as the principal causal effects, are not identifiable without additional assumptions. Several previous empirical studies leveraged auxiliary variables to improve the inference of principal causal effects. We establish a general theory for identification and estimation of the principal causal effects with auxiliary variables, which provides a solid foundation for statistical inference and more insights for model building in empirical research. In particular, we consider two commonly-used strategies for principal stratification problems: principal ignorability, and the conditional independence between the auxiliary variable and the outcome given principal strata and covariates. For these two strategies, we give non-parametric and semi-parametric identification results without modeling assumptions on the outcome. When the assumptions for neither strategies are plausible, we propose a large class of flexible parametric and semi-parametric models for identifying principal causal effects. Our theory not only establishes formal identification results of several models that have been used in previous empirical studies but also generalizes them to allow for different types of outcomes and intermediate variables.

المنهجية

Incorporating Structural Stigma into Network Analysis

100 - Francis Lee , Carter T. Butts 2019

A rich literature has explored the modeling of homophily and other forms of nonuniform mixing associated with individual-level covariates within the exponential family random graph (ERGM) framework. Such differential mixing does not fully explain phe nomena such as stigma, however, which involve the active maintenance of social boundaries by ostracism of persons with out-group ties. Here, we introduce a new statistic that allows for such effects to be captured, making it possible to probe for the potential presence of boundary maintenance above and beyond simple differences in nomination rates. We demonstrate this statistic in the context of gender segregation in a school classroom.

الشبكات الاجتماعية والمعلومات الرياضيات المتقطعة المنهجية

Incorporating domain knowledge into neural-guided search

69 - Brenden K. Petersen , Claudio P. Santiago , Mikel Landajuela Larma 2021

Many AutoML problems involve optimizing discrete objects under a black-box reward. Neural-guided search provides a flexible means of searching these combinatorial spaces using an autoregressive recurrent neural network. A major benefit of this approa ch is that builds up objects sequentially--this provides an opportunity to incorporate domain knowledge into the search by directly modifying the logits emitted during sampling. In this work, we formalize a framework for incorporating such in situ priors and constraints into neural-guided search, and provide sufficient conditions for enforcing constraints. We integrate several priors and constraints from existing works into this framework, propose several new ones, and demonstrate their efficacy in informing the task of symbolic regression.

التعلم الآلي

Data Hunches: Incorporating Personal Knowledge into Visualizations

250 - Haihan Lin , Derya Akbaba , Miriah Meyer 2021

The trouble with data is that often it provides only an imperfect representation of the phenomenon of interest. When reading and interpreting data, personal knowledge about the data plays an important role. Data visualization, however, has neither a concept defining personal knowledge about datasets, nor the methods or tools to robustly integrate them into an analysis process, thus hampering analysts ability to express their personal knowledge about datasets, and others to learn from such knowledge. In this work, we define such personal knowledge about datasets as data hunches and elevate this knowledge to another form of data that can be externalized, visualized, and used for collaboration. We establish the implications of data hunches and provide a design space for externalizing and communicating data hunches through visualization techniques. We envision such a design space will empower users to externalize their personal knowledge and support the ability to learn from others data hunches.

تفاعل الإنسان والحاسوب