Sliced Inverse Regression for Spatial Data

Added by Christoph Muehlmann

Publication date: 2020

Field: Mathematical Statistics

Language: English

Authors: Christoph Muehlmann, Hannu Oja, Klaus Nordhausen

Methodology

Abstract

Sliced inverse regression is one of the most popular sufficient dimension reduction methods. Originally, it was designed for independent and identically distributed data, and it was recently extended to the case of serially and spatially dependent data. In this work we extend it to the case of spatially dependent data where the response might also depend on neighbouring covariates when the observations are taken on a grid-like structure, as is often the case in econometric spatial regression applications. We suggest guidelines on how to decide upon the dimension of the subspace of interest and also which spatial lag might be of interest when modeling the response. These guidelines are supported by a simulation study.
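
For orientation, the following is a minimal sketch of classical sliced inverse regression for independent and identically distributed data. It is not the spatial extension described in the abstract. The function name `sir_directions` and its parameters are hypothetical choices made for this illustration, and the sketch assumes a non-singular sample covariance matrix and a continuous response.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_directions=1):
    """Classical SIR sketch (i.i.d. setting); names are illustrative only."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardize the covariates: Z = (X - mean) Cov^{-1/2}.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    cov_evals, cov_evecs = np.linalg.eigh(cov)
    cov_inv_sqrt = cov_evecs @ np.diag(cov_evals ** -0.5) @ cov_evecs.T
    Z = (X - mean) @ cov_inv_sqrt

    # Slice the observations by the ordered response.
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance matrix of the slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        slice_mean = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(slice_mean, slice_mean)

    # Leading eigenvectors of M, mapped back to the original scale.
    evals, evecs = np.linalg.eigh(M)       # ascending order
    top = np.argsort(evals)[::-1][:n_directions]
    B = cov_inv_sqrt @ evecs[:, top]
    B /= np.linalg.norm(B, axis=0)         # unit-length columns
    return B, evals[::-1]                  # eigenvalues in descending order


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 5))
    y = X[:, 0] ** 3 + 0.1 * rng.normal(size=2000)
    B, evals = sir_directions(X, y)
    print(B.ravel())
    print(evals)
```

In this sketch the sizes of the returned eigenvalues give an informal indication of how many directions carry information about the response; the guidelines mentioned in the abstract address that choice for the spatial setting.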

Sliced Inverse Moment Regression Using Weighted Chi-Squared Tests for Dimension Reduction

Zhishen Ye, Jie Yang (2013)

We propose a new method for dimension reduction in regression using the first two inverse moments. We develop corresponding weighted chi-squared tests for the dimension of the regression. The proposed method considers linear combinations of Sliced Inverse Regression (SIR) and the method using a new candidate matrix which is designed to recover the entire inverse second moment subspace. The optimal combination may be selected based on the p-values derived from the dimension tests. Theoretically, the proposed method, as well as Sliced Average Variance Estimate (SAVE), is more capable of recovering the complete central dimension reduction subspace than SIR and Principal Hessian Directions (pHd). Therefore it can substitute for SIR, pHd, SAVE, or any linear combination of them at a theoretical level. A simulation study indicates that the proposed method may have consistently greater power than SIR, pHd, and SAVE.

Methodology, Statistics Theory

Online Sparse Sliced Inverse Regression

Haoyang Cheng, Wenquan Cui, Xu Jianjun (2020)

Due to the demand for tackling the problem of streaming data with high dimensional covariates, we propose an online sparse sliced inverse regression (OSSIR) method for online sufficient dimension reduction. The existing online sufficient dimension reduction methods focus on the case when the dimension $p$ is small. In this article, we show that our method can achieve better statistical accuracy and computation speed when the dimension $p$ is large. There are two important steps in our method: one is to extend the online principal component analysis to iteratively obtain the eigenvalues and eigenvectors of the kernel matrix, and the other is to use the truncated gradient to achieve online $L_{1}$ regularization. We also analyze the convergence of the extended candid covariance-free incremental PCA (CCIPCA) and our method. By comparing several existing methods in simulations and real data applications, we demonstrate the effectiveness and efficiency of our method.
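
The truncated gradient step mentioned above can be illustrated with the generic truncation operator of Langford, Li and Zhang, which shrinks small coefficients toward zero after a gradient update. This is a sketch of that generic operator only, not the OSSIR update itself; the function name `truncate` and the parameter names `shrinkage` and `threshold` are assumptions made for this illustration.

```python
def truncate(weights, shrinkage, threshold=float("inf")):
    """Shrink each weight with magnitude at most `threshold` toward zero.

    With threshold = infinity this reduces to soft thresholding, which is
    how the operator induces sparsity (online L1 regularization).
    """
    result = []
    for w in weights:
        if 0.0 <= w <= threshold:
            # Small positive weight: move down, but not past zero.
            result.append(max(0.0, w - shrinkage))
        elif -threshold <= w < 0.0:
            # Small negative weight: move up, but not past zero.
            result.append(min(0.0, w + shrinkage))
        else:
            # Large weights are left untouched.
            result.append(w)
    return result


if __name__ == "__main__":
    # Hypothetical coefficients after a gradient step.
    print(truncate([0.5, -0.05, 2.0, -0.3], shrinkage=0.1, threshold=1.0))
```

Weights whose magnitude exceeds `threshold` pass through unchanged, so only coefficients that are already small are pushed to exactly zero.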

Computation

Cluster-Based Regularized Sliced Inverse Regression for Forecasting Macroeconomic Variables

Yue Yu, Zhihong Chen, Jie Yang (2011)

This article concerns dimension reduction in regression for large data sets. We introduce a new method based on the sliced inverse regression approach, called cluster-based regularized sliced inverse regression. Our method not only keeps the merit of considering both response and predictor information, but also enhances the capability of handling highly correlated variables. It is justified under certain linearity conditions. An empirical application on a macroeconomic data set shows that our method outperforms the dynamic factor model and other shrinkage methods.

Applications

Doubly Robust Regression Analysis for Data Fusion

Katherine Evans, BaoLuo Sun, James Robins (2018)

This paper investigates the problem of making inference about a parametric model for the regression of an outcome variable $Y$ on covariates $(V,L)$ when data are fused from two separate sources, one of which contains information only on $(V, Y)$ while the other contains information only on covariates. This data fusion setting may be viewed as an extreme form of missing data in which the probability of observing complete data $(V,L,Y)$ on any given subject is zero. We have developed a large class of semiparametric estimators, which includes doubly robust (DR) estimators, of the regression coefficients in fused data. The proposed method is DR in that it is consistent and asymptotically normal if, in addition to the model of interest, we correctly specify a model for either the data source process under an ignorability assumption, or the distribution of unobserved covariates. We evaluate the performance of our various estimators via an extensive simulation study, and apply the proposed methods to investigate the relationship between net asset value and total expenditure among U.S. households in 1998, while controlling for potential confounders including income and other demographic variables.

Methodology

Regression Analysis of Correlations for Correlated Data

Jie Hu (2021)

Correlated data are ubiquitous in today's data-driven society. A fundamental task in analyzing these data is to understand, characterize and utilize the correlations in them in order to conduct valid inference. Yet explicit regression analysis of correlations has so far been limited to longitudinal data, a special form of correlated data, while implicit analysis via mixed-effects models lacks generality as a full inferential tool. This paper proposes a novel regression approach for modelling the correlation structure, leveraging a new generalized z-transformation. This transformation maps correlation matrices that are constrained to be positive definite to vectors with unrestricted support, and is order-invariant. Building on these two properties, we develop a regression model to relate the transformed parameters to any covariates. We show that, coupled with a mean and a variance regression model, the use of maximum likelihood leads to asymptotically normal parameter estimates, and crucially enables statistical inference for all the parameters. The performance of our framework is demonstrated in extensive simulation. More importantly, we illustrate the use of our model with the analysis of the classroom data, a highly unbalanced multilevel clustered data set with within-class and within-school correlations, and the analysis of the malaria immune response data in Benin, a longitudinal data set with time-dependent covariates in addition to time. Our analyses reveal previously unknown insights.

Methodology