
Scalable Interpretable Learning for Multi-Response Error-in-Variables Regression


Published by: Jie Wu

Publication date: 2020

Research field: Mathematical Statistics

Language: English

Authors: J. Wu, Z. Zheng, Y. Li


Abstract

Corrupted data sets containing noisy or missing observations are prevalent in various contemporary applications such as economics, finance and bioinformatics. Despite the recent methodological and algorithmic advances in high-dimensional multi-response regression, how to achieve scalable and interpretable estimation under contaminated covariates is unclear. In this paper, we develop a new methodology called convex conditioned sequential sparse learning (COSS) for error-in-variables multi-response regression under both additive measurement errors and random missing data. It combines the strengths of the recently developed sequential sparse factor regression and the nearest positive semi-definite matrix projection, thus enjoying stepwise convexity and scalability in large-scale association analyses. Comprehensive theoretical guarantees are provided and we demonstrate the effectiveness of the proposed methodology through numerical studies.
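As a rough illustration of why a positive semi-definite projection is needed under additive measurement errors, the sketch below builds the usual bias-corrected Gram matrix and then repairs it. It is not the COSS procedure from the paper: the names `surrogate_gram` and `project_psd` are hypothetical, the noise covariance `sigma_a` is assumed known, and the projection is plain eigenvalue clipping (nearest in Frobenius norm), which may differ from the projection the authors use.

```python
import numpy as np


def surrogate_gram(Z, sigma_a):
    """Bias-corrected estimate of X^T X / n when Z = X + A is observed.

    Assumes the rows of the noise matrix A have known covariance sigma_a.
    """
    n = Z.shape[0]
    return Z.T @ Z / n - sigma_a


def project_psd(S, eps=0.0):
    """Project a symmetric matrix onto {M : M >= eps * I} in Frobenius norm.

    This is eigenvalue clipping, used here as a simple stand-in for a
    nearest positive semi-definite matrix projection.
    """
    S = (S + S.T) / 2.0                      # enforce exact symmetry
    w, V = np.linalg.eigh(S)                 # eigenvalues in ascending order
    return (V * np.maximum(w, eps)) @ V.T    # V diag(clipped w) V^T


# Synthetic example: more covariates than samples, noisy covariates.
rng = np.random.default_rng(0)
n, p = 30, 50
X = rng.normal(size=(n, p))                     # unobserved clean covariates
sigma_a = 0.25 * np.eye(p)                      # assumed known noise covariance
Z = X + rng.normal(scale=0.5, size=(n, p))      # observed corrupted covariates

S = surrogate_gram(Z, sigma_a)
S_psd = project_psd(S)

min_eig_before = np.linalg.eigvalsh(S).min()
min_eig_after = np.linalg.eigvalsh(S_psd).min()
print(min_eig_before, min_eig_after)
```

With more covariates than samples, `Z.T @ Z / n` is rank deficient, so subtracting `sigma_a` leaves negative eigenvalues and the resulting quadratic loss is non-convex. Clipping the eigenvalues restores convexity, which is the role the projection plays in the abstract's description.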


Swarnadip Ghosh, Trevor Hastie, Art B. Owen, 2021

The cost of both generalized least squares (GLS) and Gibbs sampling in a crossed random effects model can easily grow faster than $N^{3/2}$ for $N$ observations. Ghosh et al. (2020) develop a backfitting algorithm that reduces the cost to $O(N)$. Here we extend that method to a generalized linear mixed model for logistic regression. We use backfitting within an iteratively reweighted penalized least squares algorithm. The specific approach is a version of penalized quasi-likelihood due to Schall (1991). A straightforward version of Schall's algorithm would also cost more than $N^{3/2}$ because it requires the trace of the inverse of a large matrix. We approximate that quantity at cost $O(N)$ and prove that this substitution makes an asymptotically negligible difference. Our backfitting algorithm also collapses the fixed effect with one random effect at a time in a way that is analogous to the collapsed Gibbs sampler of Papaspiliopoulos et al. (2020). We use a symmetric operator that facilitates efficient covariance computation. We illustrate our method on a real dataset from Stitch Fix. By properly accounting for crossed random effects we show that a naive logistic regression could underestimate sampling variances by several hundredfold.

Methodology, Statistics Theory, Computation

Recycled Least Squares Estimation in Nonlinear Regression

Ben Boukai, Yue Zhang, 2018

We consider a resampling scheme for parameter estimates in nonlinear regression models. We provide an estimation procedure which recycles, via random weighting, the relevant parameter estimates to construct consistent estimates of the sampling distribution of the various estimates. We establish the asymptotic normality of the resampled estimates and demonstrate the applicability of the recycling approach in a small simulation study and via an example.

Methodology, Statistics Theory, Applications

Anisotropic local constant smoothing for change-point regression function estimation

John R.J. Thompson, W. John Braun, 2020

Understanding forest fire spread in any region of Canada is critical to promoting forest health, and protecting human life and infrastructure. Quantifying fire spread from noisy images, where regions of a fire are separated by change-point boundaries, is critical to faithfully estimating fire spread rates. In this research, we develop a statistically consistent smooth estimator that allows us to denoise fire spread imagery from micro-fire experiments. We develop an anisotropic smoothing method for change-point data that uses estimates of the underlying data generating process to inform smoothing. We show that the anisotropic local constant regression estimator is consistent with convergence rate $O\left(n^{-1/(q+2)}\right)$. We demonstrate its effectiveness on simulated one- and two-dimensional change-point data and fire spread imagery from micro-fire experiments.

Methodology, Statistics Theory, Applications

Parallel integrative learning for large-scale multi-response regression with incomplete outcomes

Ruipeng Dong, Daoji Li, Zemin Zheng, 2021

Multi-task learning is increasingly used to investigate the association structure between multiple responses and a single set of predictor variables in many applications. In the era of big data, the coexistence of incomplete outcomes, a large number of responses, and high dimensionality in predictors poses unprecedented challenges in estimation, prediction, and computation. In this paper, we propose a scalable and computationally efficient procedure, called PEER, for large-scale multi-response regression with incomplete outcomes, where both the numbers of responses and predictors can be high-dimensional. Motivated by sparse factor regression, we convert the multi-response regression into a set of univariate-response regressions, which can be efficiently implemented in parallel. Under some mild regularity conditions, we show that PEER enjoys nice sampling properties including consistency in estimation, prediction, and variable selection. Extensive simulation studies show that our proposal compares favorably with several existing methods in estimation accuracy, variable selection, and computational efficiency.

Methodology, Applications, Computation

R-optimal designs for multi-response regression models with multi-factors

Pengqi Liu, Lucy Gao, 2019

We investigate R-optimal designs for multi-response regression models with multi-factors, where the random errors in these models are correlated. Several theoretical results are derived for R-optimal designs, including scale invariance, reflection symmetry, line and plane symmetry, and dependence on the covariance matrix of the errors. All the results can be applied to linear and nonlinear models. In addition, an efficient algorithm based on an interior point method is developed for finding R-optimal designs on discrete design spaces. The algorithm is very flexible, and can be applied to any multi-response regression model.

Methodology


Read also