Structural Change in Sparsity

99 0 0.0 ( 0 )

Download Cite

Added by Youngki Shin Youngki Shin

Publication date 2014

fields Mathematical Statistics

and research's language is English

Authors Sokbae Lee - Yuan Liao - Myung Hwan Seo

Methodology

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In the high-dimensional sparse modeling literature, it has been crucially assumed that the sparsity structure of the model is homogeneous over the entire population. That is, the identities of important regressors are invariant across the population and across the individuals in the collected sample. In practice, however, the sparsity structure may not always be invariant in the population, due to heterogeneity across different sub-populations. We consider a general, possibly non-smooth M-estimation framework, allowing a possible structural change regarding the identities of important regressors in the population. Our penalized M-estimator not only selects covariates but also discriminates between a model with homogeneous sparsity and a model with a structural change in sparsity. As a result, it is not necessary to know or pretest whether the structural change is present, or where it occurs. We derive asymptotic bounds on the estimation loss of the penalized M-estimators, and achieve the oracle properties. We also show that when there is a structural change, the estimator of the threshold parameter is super-consistent. If the signal is relatively strong, the rates of convergence can be further improved and asymptotic distributional properties of the estimators including the threshold estimator can be established using an adaptive penalization. The proposed methods are then applied to quantile regression and logistic regression models and are illustrated via Monte Carlo experiments.

rate research

Estimation of high-dimensional change-points under a group sparsity structure

190 - Hanqing Cai , Tengyao Wang 2021

Change-points are a routine feature of big data observed in the form of high-dimensional data streams. In many such data streams, the component series possess group structures and it is natural to assume that changes only occur in a small number of all groups. We propose a new change point procedure, called groupInspect, that exploits the group sparsity structure to estimate a projection direction so as to aggregate information across the component series to successfully estimate the change-point in the mean structure of the series. We prove that the estimated projection direction is minimax optimal, up to logarithmic factors, when all group sizes are of comparable order. Moreover, our theory provide strong guarantees on the rate of convergence of the change-point location estimator. Numerical studies demonstrates the competitive performance of groupInspect in a wide range of settings and a real data example confirms the practical usefulness of our procedure.

Methodology Statistics Theory Statistics Theory

Bootstrapping Structural Change Tests

47 - Otilia Boldea 2018

This paper analyses the use of bootstrap methods to test for parameter change in linear models estimated via Two Stage Least Squares (2SLS). Two types of test are considered: one where the null hypothesis is of no change and the alternative hypothesis involves discrete change at k unknown break-points in the sample; and a second test where the null hypothesis is that there is discrete parameter change at l break-points in the sample against an alternative in which the parameters change at l + 1 break-points. In both cases, we consider inferences based on a sup-Wald-type statistic using either the wild recursive bootstrap or the wild fixed bootstrap. We establish the asymptotic validity of these bootstrap tests under a set of general conditions that allow the errors to exhibit conditional and/or unconditional heteroskedasticity, and report results from a simulation study that indicate the tests yield reliable inferences in the sample sizes often encountered in macroeconomics. The analysis covers the cases where the first-stage estimation of 2SLS involves a model whose parameters are either constant or themselves subject to discrete parameter change. If the errors exhibit unconditional heteroskedasticity and/or the reduced form is unstable then the bootstrap methods are particularly attractive because the limiting distributions of the test statistics are not pivotal.

Econometrics

246 - Yuan Huang , Qingzhao Zhang , Sanguo Zhang 2015

For data with high-dimensional covariates but small to moderate sample sizes, the analysis of single datasets often generates unsatisfactory results. The integrative analysis of multiple independent datasets provides an effective way of pooling information and outperforms single-dataset analysis and some alternative multi-datasets approaches including meta-analysis. Under certain scenarios, multiple datasets are expected to share common important covariates, that is, the multiple models have similarity in sparsity structures. However, the existing methods do not have a mechanism to {it promote} the similarity of sparsity structures in integrative analysis. In this study, we consider penalized variable selection and estimation in integrative analysis. We develop an $L_0$-penalty based approach, which is the first to explicitly promote the similarity of sparsity structures. Computationally it is realized using a coordinate descent algorithm. Theoretically it has the much desired consistency properties. In simulation, it significantly outperforms the competing alternative when the models in multiple datasets share common important covariates. It has better or similar performance as the alternative when the sparsity structures share no similarity. Thus it provides a safe choice for data analysis. Applying the proposed method to three lung cancer datasets with gene expression measurements leads to models with significantly more similar sparsity structures and better prediction performance.

Methodology

Why did the distribution change?

344 - Kailash Budhathoki , Dominik Janzing , Patrick Bloebaum 2021

We describe a formal approach based on graphical causal models to identify the root causes of the change in the probability distribution of variables. After factorizing the joint distribution into conditional distributions of each variable, given its parents (the causal mechanisms), we attribute the change to changes of these causal mechanisms. This attribution analysis accounts for the fact that mechanisms often change independently and sometimes only some of them change. Through simulations, we study the performance of our distribution change attribution method. We then present a real-world case study identifying the drivers of the difference in the income distribution between men and women.

Methodology Artificial Intelligence Machine Learning

Experimentation for Homogenous Policy Change

130 - Molly Offer-Westort , Drew Dimmery 2021

When the Stable Unit Treatment Value Assumption (SUTVA) is violated and there is interference among units, there is not a uniquely defined Average Treatment Effect (ATE), and alternative estimands may be of interest, among them average unit-level differences in outcomes under different homogeneous treatment policies. We term this target the Homogeneous Assignment Average Treatment Effect (HAATE). We consider approaches to experimental design with multiple treatment conditions under partial interference and, given the estimand of interest, we show that difference-in-means estimators may perform better than correctly specified regression models in finite samples on root mean squared error (RMSE). With errors correlated at the cluster level, we demonstrate that two-stage randomization procedures with intra-cluster correlation of treatment strictly between zero and one may dominate one-stage randomization designs on the same metric. Simulations demonstrate performance of this approach; an application to online experiments at Facebook is discussed.

Methodology Applications