
Double Machine Learning and Bad Controls -- A Cautionary Tale

Added by Paul Hünermund
Publication date: 2021
Field: Economy
Language: English





Double machine learning (DML) is becoming an increasingly popular tool for automated model selection in high-dimensional settings. At its core, DML assumes unconfoundedness, or exogeneity of all considered controls, an assumption that is likely to be violated if the covariate space is large. In this paper, we lay out a theory of bad controls building on the graph-theoretic approach to causality. We then demonstrate, based on simulation studies and an application to real-world data, that DML is very sensitive to the inclusion of bad controls and exhibits considerable bias even with only a few endogenous variables present in the conditioning set. The extent of this bias depends on the precise nature of the assumed causal model, which calls into question the ability to select appropriate controls for regressions in a purely data-driven way.
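The sensitivity described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's simulation design: the data-generating process, the function names `partial_out_effect` and `simulate`, and all coefficients are assumptions made for illustration. Plain least squares also stands in for the machine-learning nuisance estimators that DML would normally use, so the example shows only the cross-fitted partialling-out step and how a single collider in the conditioning set can distort it.

```python
import numpy as np

def partial_out_effect(y, d, X, n_folds=2, seed=0):
    """Cross-fitted partialling-out estimate of the effect of d on y given X.

    Assumption: linear least squares replaces the ML learners used in DML.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    Xc = np.column_stack([np.ones(n), X])  # add an intercept column
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        # Fit both nuisance regressions on the training fold only
        beta_y, *_ = np.linalg.lstsq(Xc[train], y[train], rcond=None)
        beta_d, *_ = np.linalg.lstsq(Xc[train], d[train], rcond=None)
        # Residualize outcome and treatment on the held-out fold
        y_res[test] = y[test] - Xc[test] @ beta_y
        d_res[test] = d[test] - Xc[test] @ beta_d
    # Regress outcome residuals on treatment residuals
    return float(d_res @ y_res / (d_res @ d_res))

def simulate(n=20000, theta=1.0, seed=0):
    """Hypothetical design: x is a confounder, c is a collider (bad control)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                        # good control
    d = 0.8 * x + rng.normal(size=n)              # treatment
    y = theta * d + 0.5 * x + rng.normal(size=n)  # outcome, true effect = theta
    c = d + y + rng.normal(size=n)                # caused by both d and y
    good = partial_out_effect(y, d, x.reshape(-1, 1))
    bad = partial_out_effect(y, d, np.column_stack([x, c]))
    return good, bad

if __name__ == "__main__":
    good, bad = simulate()
    print(f"controls = [x]:    {good:.3f}")
    print(f"controls = [x, c]: {bad:.3f}")
```

In this assumed design, conditioning on the confounder alone recovers an estimate close to the true effect, while adding the collider pulls the estimate far away from it. A different causal structure for the bad control would give a different bias, which is consistent with the abstract's point that the bias depends on the assumed causal model.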




Resolution studies of test problems set baselines and help define minimum resolution requirements; however, resolution studies must also be performed on scientific simulations to determine the effect of resolution on the specific scientific results. We perform a resolution study on the formation of a protostar by modelling the collapse of gas through 14 orders of magnitude in density. This is done using compressible radiative non-ideal magnetohydrodynamics. Our suite consists of an ideal magnetohydrodynamics (MHD) model and two non-ideal MHD models, and we test three resolutions for each model. The resulting structure of the ideal MHD model is approximately independent of resolution, although higher magnetic field strengths are realised in higher resolution models. The non-ideal MHD models are more dependent on resolution, specifically the magnetic field strength and structure. Stronger magnetic fields are realised in higher resolution models, and the evolution of detailed structures such as magnetic walls is only resolved in our highest resolution simulation. In several of the non-ideal MHD models, there is an offset between the location of the maximum magnetic field strength and the maximum density, which is often obscured or lost at lower resolutions. Thus, understanding the effects of resolution on numerical star formation is imperative for understanding the formation of a star.
A. Pastore, M. Carnini 2020
We present three different methods to estimate error bars on the predictions made using a neural network. All of them represent lower bounds for the extrapolation errors. For example, we did not include an analysis on robustness against small perturbations of the input data. At first, we illustrate the methods through a simple toy model, then, we apply them to some realistic cases related to nuclear masses. By using theoretical data simulated either with a liquid-drop model or a Skyrme energy density functional, we benchmark the extrapolation performance of the neural network in regions of the Segrè chart far away from the ones used for the training and validation. Finally, we discuss how error bars can help identify when the extrapolation becomes too uncertain and thus unreliable.
Analysis of cluster and field star uvby data demonstrates the existence of a previously undetected discrepancy in a widely used photometric metallicity calibration for G dwarfs. The discrepancy is systematic and strongly color-dependent, reducing the estimated [Fe/H] for stars above [Fe/H] ~ -0.2 by between +0.1 and +0.4 dex, and creating a deficit of metal-rich stars among dwarfs of mid-G and later spectral type. The source of the problem, triggered for stars with b-y greater than about 0.47, appears to be an enhanced metallicity dependence for the c1 index that increases as temperature declines. The link between c1, normally a surface gravity indicator, and metallicity produces two secondary effects. The deficit in the photometric abundance for a cool dwarf is partially compensated by some degree of evolution off the main sequence and cool dwarfs with metallicities significantly above the Hyades are found to have c1 indices that classify them as giants. The potential impact of the problem on stellar population studies is discussed.
We propose a practical and robust method for making inferences on average treatment effects estimated by synthetic controls. We develop a $K$-fold cross-fitting procedure for bias-correction. To avoid the difficult estimation of the long-run variance, inference is based on a self-normalized $t$-statistic, which has an asymptotically pivotal $t$-distribution. Our $t$-test is easy to implement, provably robust against misspecification, valid with non-stationary data, and demonstrates an excellent small sample performance. Compared to difference-in-differences, our method often yields more than 50% shorter confidence intervals and is robust to violations of parallel trends assumptions. An R-package for implementing our methods is available.
M. Molina 2019
The attenuation of light in star forming galaxies is correlated with a multitude of physical parameters including star formation rate, metallicity and total dust content. This variation in attenuation is even more prevalent on the kiloparsec scale, which is relevant to many current spectroscopic integral field unit surveys. To understand the cause of this variation, we present and analyse \textit{Swift}/UVOT near-UV (NUV) images and SDSS/MaNGA emission-line maps of 29 nearby ($z<0.084$) star forming galaxies. We resolve kiloparsec-sized star forming regions within the galaxies and compare their optical nebular attenuation (i.e., the Balmer emission line optical depth, $\tau^l_B\equiv\tau_{\textrm{H}\beta}-\tau_{\textrm{H}\alpha}$) and NUV stellar continuum attenuation (via the NUV power-law index, $\beta$) to the attenuation law described by Battisti et al. The data agree with that model, albeit with significant scatter. We explore the dependence of the scatter of the $\beta$-$\tau^l_B$ measurements from the star forming regions on different physical parameters, including distance from the nucleus, star formation rate and total dust content. Finally, we compare the measured $\tau^l_B$ and $\beta$ between the individual star forming regions and the integrated galaxy light. We find a strong variation in $\beta$ between the kiloparsec scale and the larger galaxy scale not seen in $\tau^l_B$. We conclude that the sight-line dependence of UV attenuation and the reddening of $\beta$ due to the light from older stellar populations could contribute to the $\beta$-$\tau^l_B$ discrepancy.