Scalable Interpretable Learning for Multi-Response Error-in-Variables Regression

Added by Jie Wu

Publication date 2020

Field: Mathematical Statistics

Language: English

Authors J. Wu - Z. Zheng - Y. Li

Corrupted data sets containing noisy or missing observations are prevalent in various contemporary applications such as economics, finance and bioinformatics. Despite the recent methodological and algorithmic advances in high-dimensional multi-response regression, how to achieve scalable and interpretable estimation under contaminated covariates is unclear. In this paper, we develop a new methodology called convex conditioned sequential sparse learning (COSS) for error-in-variables multi-response regression under both additive measurement errors and random missing data. It combines the strengths of the recently developed sequential sparse factor regression and the nearest positive semi-definite matrix projection, thus enjoying stepwise convexity and scalability in large-scale association analyses. Comprehensive theoretical guarantees are provided and we demonstrate the effectiveness of the proposed methodology through numerical studies.
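As a rough illustration of the nearest positive semi-definite projection mentioned in the abstract, the sketch below builds the usual bias-corrected Gram matrix under additive measurement error and projects it back onto the positive semi-definite cone. It is not the authors' COSS implementation: the abstract does not say which norm the projection uses, so a Frobenius-norm projection by eigenvalue clipping is swapped in here, and the function names, the noise level `tau`, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def surrogate_gram(W, sigma_a):
    """Bias-corrected Gram matrix when W = X + A and Cov(A) = sigma_a (assumed known)."""
    n = W.shape[0]
    return W.T @ W / n - sigma_a

def nearest_psd(S, floor=0.0):
    """Frobenius-norm projection of a symmetric matrix onto the PSD cone.

    Eigenvalues below `floor` are raised to `floor`; eigenvectors are kept.
    """
    S = (S + S.T) / 2.0                      # symmetrize against rounding error
    vals, vecs = np.linalg.eigh(S)
    vals = np.clip(vals, floor, None)
    return (vecs * vals) @ vecs.T            # V diag(vals) V^T

# Simulated example with more covariates than observations (hypothetical sizes).
rng = np.random.default_rng(0)
n, p, tau = 20, 30, 0.5
X = rng.normal(size=(n, p))                  # clean covariates (unobserved in practice)
W = X + tau * rng.normal(size=(n, p))        # observed, noise-contaminated covariates

S_hat = surrogate_gram(W, tau**2 * np.eye(p))
S_psd = nearest_psd(S_hat)

min_before = np.linalg.eigvalsh(S_hat).min()
min_after = np.linalg.eigvalsh(S_psd).min()
print(f"smallest eigenvalue before: {min_before:.3f}, after: {min_after:.3e}")
```

With fewer observations than covariates, the corrected Gram matrix has negative eigenvalues, so a least squares or lasso objective built on it is not convex. After the projection the quadratic form is convex again, which is the property the abstract refers to as stepwise convexity.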

Scalable logistic regression with crossed random effects

Swarnadip Ghosh, Trevor Hastie, Art B. Owen (2021)

The cost of both generalized least squares (GLS) and Gibbs sampling in a crossed random effects model can easily grow faster than $N^{3/2}$ for $N$ observations. Ghosh et al. (2020) develop a backfitting algorithm that reduces the cost to $O(N)$. Here we extend that method to a generalized linear mixed model for logistic regression. We use backfitting within an iteratively reweighted penalized least squares algorithm. The specific approach is a version of penalized quasi-likelihood due to Schall (1991). A straightforward version of Schall's algorithm would also cost more than $N^{3/2}$ because it requires the trace of the inverse of a large matrix. We approximate that quantity at cost $O(N)$ and prove that this substitution makes an asymptotically negligible difference. Our backfitting algorithm also collapses the fixed effect with one random effect at a time in a way that is analogous to the collapsed Gibbs sampler of Papaspiliopoulos et al. (2020). We use a symmetric operator that facilitates efficient covariance computation. We illustrate our method on a real dataset from Stitch Fix. By properly accounting for crossed random effects we show that a naive logistic regression could underestimate sampling variances by several hundredfold.

Methodology Statistics Theory Computation
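To make the $O(N)$ backfitting idea in the crossed random effects abstract above more concrete, here is a minimal sketch for the linear model $y = \mu + a_{i} + b_{j} + e$ only. It does not cover the logistic extension, the collapsing step, or the trace approximation described in that abstract. The penalties `lam_a` and `lam_b`, the function names, and the simulated data are assumptions made for illustration.

```python
import numpy as np

def objective(y, rows, cols, mu, a, b, lam_a, lam_b):
    """Penalized least squares criterion that the backfitting loop minimizes."""
    resid = y - mu - a[rows] - b[cols]
    return resid @ resid + lam_a * (a @ a) + lam_b * (b @ b)

def backfit_crossed(y, rows, cols, n_rows, n_cols, lam_a, lam_b, n_iter=30):
    """Alternate shrunken row and column means; each sweep costs O(N)."""
    count_a = np.bincount(rows, minlength=n_rows)
    count_b = np.bincount(cols, minlength=n_cols)
    mu, a, b = y.mean(), np.zeros(n_rows), np.zeros(n_cols)
    for _ in range(n_iter):
        mu = np.mean(y - a[rows] - b[cols])
        # Ridge-shrunken mean of partial residuals within each row level.
        a = np.bincount(rows, weights=y - mu - b[cols], minlength=n_rows) / (count_a + lam_a)
        # Same update for the column levels.
        b = np.bincount(cols, weights=y - mu - a[rows], minlength=n_cols) / (count_b + lam_b)
    return mu, a, b

# Simulated crossed design (hypothetical sizes): N observations, 50 x 40 levels.
rng = np.random.default_rng(2)
N, n_rows, n_cols = 2000, 50, 40
rows = rng.integers(0, n_rows, size=N)
cols = rng.integers(0, n_cols, size=N)
a_true = rng.normal(size=n_rows)
b_true = rng.normal(size=n_cols)
y = 1.0 + a_true[rows] + b_true[cols] + rng.normal(size=N)

mu_hat, a_hat, b_hat = backfit_crossed(y, rows, cols, n_rows, n_cols, lam_a=1.0, lam_b=1.0)

obj_init = objective(y, rows, cols, y.mean(), np.zeros(n_rows), np.zeros(n_cols), 1.0, 1.0)
obj_fit = objective(y, rows, cols, mu_hat, a_hat, b_hat, 1.0, 1.0)
print(f"objective: {obj_init:.1f} -> {obj_fit:.1f}")
```

Each update touches every observation once through `np.bincount`, so no $N \times N$ matrix is ever formed; that is the source of the linear cost per sweep.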

Recycled Least Squares Estimation in Nonlinear Regression

Ben Boukai, Yue Zhang (2018)

We consider a resampling scheme for parameter estimates in nonlinear regression models. We provide an estimation procedure which recycles, via random weighting, the relevant parameter estimates to construct consistent estimates of the sampling distribution of the various estimates. We establish the asymptotic normality of the resampled estimates and demonstrate the applicability of the recycling approach in a small simulation study and via an example.

Methodology Statistics Theory Applications
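The random weighting idea in the recycled least squares abstract above can be sketched as follows. This is a generic random-weighting resampler, not necessarily the authors' exact recycling scheme: the exponential decay model, the Exp(1) weights, the plain Gauss-Newton solver, and the starting value are all assumptions chosen to keep the example short.

```python
import numpy as np

def fit_exp_decay(x, y, w, theta_start, n_iter=30):
    """Weighted Gauss-Newton fit of the model y ~ theta[0] * exp(-theta[1] * x)."""
    theta = np.array(theta_start, dtype=float)
    for _ in range(n_iter):
        e = np.exp(-theta[1] * x)
        resid = y - theta[0] * e
        J = np.column_stack([e, -theta[0] * x * e])   # Jacobian of the model in theta
        JW = J * w[:, None]
        # Gauss-Newton step: solve (J^T W J) delta = J^T W resid.
        theta = theta + np.linalg.solve(J.T @ JW, JW.T @ resid)
    return theta

# Simulated data (hypothetical true parameters and noise level).
rng = np.random.default_rng(1)
n = 200
x = np.linspace(0.0, 2.0, n)
theta_true = np.array([2.0, 1.5])
y = theta_true[0] * np.exp(-theta_true[1] * x) + 0.02 * rng.normal(size=n)

# Ordinary (unit-weight) nonlinear least squares estimate.
theta_hat = fit_exp_decay(x, y, np.ones(n), theta_start=[1.8, 1.3])

# Refit under i.i.d. random weights with mean 1, restarting from theta_hat each time.
B = 200
replicates = np.empty((B, 2))
for b in range(B):
    w = rng.exponential(1.0, size=n)
    replicates[b] = fit_exp_decay(x, y, w, theta_start=theta_hat, n_iter=10)

se_random_weight = replicates.std(axis=0, ddof=1)
print("estimate:", theta_hat, "random-weight SE:", se_random_weight)
```

The spread of the reweighted estimates around `theta_hat` serves as an approximation to the sampling distribution of `theta_hat` itself, without drawing new data.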

Anisotropic local constant smoothing for change-point regression function estimation

John R.J. Thompson, W. John Braun (2020)

Understanding forest fire spread in any region of Canada is critical to promoting forest health, and protecting human life and infrastructure. Quantifying fire spread from noisy images, where regions of a fire are separated by change-point boundaries, is critical to faithfully estimating fire spread rates. In this research, we develop a statistically consistent smooth estimator that allows us to denoise fire spread imagery from micro-fire experiments. We develop an anisotropic smoothing method for change-point data that uses estimates of the underlying data generating process to inform smoothing. We show that the anisotropic local constant regression estimator is consistent with convergence rate $O\left(n^{-1/(q+2)}\right)$. We demonstrate its effectiveness on simulated one- and two-dimensional change-point data and fire spread imagery from micro-fire experiments.

Methodology Statistics Theory Applications
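To show why direction-aware smoothing helps with change-point data as in the abstract above, the sketch below compares an ordinary local constant (Nadaraya-Watson) fit with a simple one-sided variant in one dimension. The one-sided rule is a stand-in chosen for brevity and is not the anisotropic estimator of that paper; the bandwidth `h`, the step function, and the noise level are assumptions.

```python
import numpy as np

def gaussian_weights(x0, x, h):
    return np.exp(-0.5 * ((x - x0) / h) ** 2)

def local_constant(x, y, h):
    """Ordinary symmetric local constant (Nadaraya-Watson) estimate."""
    fit = np.empty_like(y)
    for i, x0 in enumerate(x):
        w = gaussian_weights(x0, x, h)
        fit[i] = w @ y / w.sum()
    return fit

def one_sided_local_constant(x, y, h):
    """At each point, use the left or right half-window with the smaller weighted variance."""
    fit = np.empty_like(y)
    for i, x0 in enumerate(x):
        w = gaussian_weights(x0, x, h)
        best_var = np.inf
        for side in (x <= x0, x >= x0):
            ws = w * side                      # zero out the other half of the window
            mean = ws @ y / ws.sum()
            var = ws @ (y - mean) ** 2 / ws.sum()
            if var < best_var:
                best_var, fit[i] = var, mean
    return fit

# Noisy step function with a change point at x = 0.5 (hypothetical test signal).
rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 200)
truth = (x >= 0.5).astype(float)
y = truth + 0.05 * rng.normal(size=x.size)

h = 0.05
mse_symmetric = np.mean((local_constant(x, y, h) - truth) ** 2)
mse_one_sided = np.mean((one_sided_local_constant(x, y, h) - truth) ** 2)
print(f"MSE symmetric: {mse_symmetric:.5f}, one-sided: {mse_one_sided:.5f}")
```

The symmetric kernel averages across the jump and blurs it, while the one-sided rule avoids windows that straddle the boundary. A known weakness of this simple rule is that half-windows near the ends of the data contain few points, so the fit is noisier there.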

Parallel integrative learning for large-scale multi-response regression with incomplete outcomes

Ruipeng Dong, Daoji Li, Zemin Zheng (2021)

Multi-task learning is increasingly used to investigate the association structure between multiple responses and a single set of predictor variables in many applications. In the era of big data, the coexistence of incomplete outcomes, a large number of responses, and high dimensionality in predictors poses unprecedented challenges in estimation, prediction, and computation. In this paper, we propose a scalable and computationally efficient procedure, called PEER, for large-scale multi-response regression with incomplete outcomes, where both the numbers of responses and predictors can be high-dimensional. Motivated by sparse factor regression, we convert the multi-response regression into a set of univariate-response regressions, which can be efficiently implemented in parallel. Under some mild regularity conditions, we show that PEER enjoys nice sampling properties including consistency in estimation, prediction, and variable selection. Extensive simulation studies show that our proposal compares favorably with several existing methods in estimation accuracy, variable selection, and computational efficiency.

Methodology Applications Computation
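The conversion of a multi-response regression into independent univariate-response regressions, described in the PEER abstract above, can be illustrated in a stripped-down setting. The sketch assumes complete outcomes, low dimensions, and no sparsity penalty, so it reduces to classical reduced-rank regression rather than PEER itself; the function names, the rank `r`, and the simulated data are assumptions.

```python
import numpy as np

def right_singular_vectors(X, Y, r):
    """Top-r right singular vectors of the least squares fitted values X @ C_ols."""
    C_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
    _, _, Vt = np.linalg.svd(X @ C_ols, full_matrices=False)
    return Vt[:r].T                            # shape (q, r)

def fit_layer(X, Y, v):
    """One univariate-response regression: the combined response Y @ v on X."""
    return np.linalg.lstsq(X, Y @ v, rcond=None)[0]

# Simulated rank-2 coefficient matrix (hypothetical sizes).
rng = np.random.default_rng(3)
n, p, q, r = 200, 5, 8, 2
X = rng.normal(size=(n, p))
C_true = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))
Y = X @ C_true + 0.1 * rng.normal(size=(n, q))

V = right_singular_vectors(X, Y, r)

# Each layer depends only on (X, Y, V[:, k]), so the fits could run on separate workers.
layers = [fit_layer(X, Y, V[:, k]) for k in range(r)]

# Reassemble the coefficient matrix as a sum of rank-one layers.
C_hat = sum(np.outer(beta, V[:, k]) for k, beta in enumerate(layers))

rel_err = np.linalg.norm(C_hat - C_true) / np.linalg.norm(C_true)
print(f"relative error: {rel_err:.4f}, rank: {np.linalg.matrix_rank(C_hat, tol=1e-8)}")
```

Because the layers share no state once `V` is fixed, the list comprehension can be replaced by any parallel map. In a high-dimensional setting each `fit_layer` call would presumably be a penalized regression instead of plain least squares.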

R-optimal designs for multi-response regression models with multi-factors

Pengqi Liu, Lucy Gao (2019)

We investigate R-optimal designs for multi-response regression models with multi-factors, where the random errors in these models are correlated. Several theoretical results are derived for R-optimal designs, including scale invariance, reflection symmetry, line and plane symmetry, and dependence on the covariance matrix of the errors. All the results can be applied to linear and nonlinear models. In addition, an efficient algorithm based on an interior point method is developed for finding R-optimal designs on discrete design spaces. The algorithm is very flexible, and can be applied to any multi-response regression model.

Methodology
