Localizing Changes in High-Dimensional Regression Models

Published by: Daren Wang

Publication date: 2020

Research field: Mathematical Statistics

Language: English

Authors: Alessandro Rinaldo, Daren Wang, Qin Wen

Methodology

Abstract

This paper addresses the problem of localizing change points in high-dimensional linear regression models with piecewise constant regression coefficients. We develop a dynamic programming approach to estimate the locations of the change points whose performance improves upon the current state-of-the-art, even as the dimensionality, the sparsity of the regression coefficients, the temporal spacing between two consecutive change points, and the magnitude of the difference of two consecutive regression coefficient vectors are allowed to vary with the sample size. Furthermore, we devise a computationally-efficient refinement procedure that provably reduces the localization error of preliminary estimates of the change points. We demonstrate minimax lower bounds on the localization error that nearly match the upper bound on the localization error of our methodology and show that the signal-to-noise condition we impose is essentially the weakest possible based on information-theoretic arguments. Extensive numerical results support our theoretical findings, and experiments on real air quality data reveal change points supported by historical information not used by the algorithm.
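The dynamic programming idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single covariate, a regression through the origin, ordinary least squares on each segment, and a hand-picked penalty. The names `localize_changes`, `segment_rss`, `penalty`, and `min_len` are hypothetical. The high-dimensional setting in the paper would need a sparse estimator on each segment, which this sketch does not implement.

```python
from itertools import accumulate


def segment_rss(sxx, sxy, syy, s, e):
    """Residual sum of squares of the least-squares fit y = b * x on points s..e-1,
    computed from prefix sums in constant time."""
    xx = sxx[e] - sxx[s]
    xy = sxy[e] - sxy[s]
    yy = syy[e] - syy[s]
    if xx == 0.0:
        return yy
    # Clamp tiny negative values caused by floating-point cancellation.
    return max(yy - xy * xy / xx, 0.0)


def localize_changes(x, y, penalty, min_len=2):
    """Penalized dynamic programming over all segmentations (illustrative only).

    Minimizes (sum of per-segment RSS) + penalty * (number of change points).
    Returns the sorted indices at which a new segment starts.
    """
    n = len(x)
    # Prefix sums so that every candidate segment is scored in O(1).
    sxx = [0.0] + list(accumulate(a * a for a in x))
    sxy = [0.0] + list(accumulate(a * b for a, b in zip(x, y)))
    syy = [0.0] + list(accumulate(b * b for b in y))

    inf = float("inf")
    best = [inf] * (n + 1)   # best[e]: optimal penalized cost of the first e points
    best[0] = -penalty       # the first segment then carries no penalty
    prev = [0] * (n + 1)     # prev[e]: start index of the last segment ending at e

    for e in range(min_len, n + 1):
        for s in range(0, e - min_len + 1):
            if best[s] == inf:
                continue
            cost = best[s] + segment_rss(sxx, sxy, syy, s, e) + penalty
            if cost < best[e]:
                best[e] = cost
                prev[e] = s

    # Walk the back-pointers from the end to recover the segment boundaries.
    changes = []
    e = n
    while prev[e] > 0:
        e = prev[e]
        changes.append(e)
    return sorted(changes)


# Toy data (an assumption for illustration): the slope switches from 2 to -1 at index 20.
x = [1.0 + (i % 5) for i in range(40)]
noise = [0.01 * ((i * 7) % 3 - 1) for i in range(40)]
y = [(2.0 if i < 20 else -1.0) * x[i] + noise[i] for i in range(40)]
print(localize_changes(x, y, penalty=1.0))
```

The double loop costs O(n^2) segment evaluations, which is affordable here only because prefix sums make each evaluation constant time. The penalty trades off fit against the number of change points, and its value in this sketch is arbitrary.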

Jingfei Zhang, Yi Li (2020)

Though Gaussian graphical models have been widely used in many scientific fields, limited progress has been made to link graph structures to external covariates because of substantial challenges in theory and computation. We propose a Gaussian graphical regression model, which regresses both the mean and the precision matrix of a Gaussian graphical model on covariates. In the context of co-expression quantitative trait locus (QTL) studies, our framework facilitates estimation of both population- and subject-level gene regulatory networks, and detection of how subject-level networks vary with genetic variants and clinical conditions. Our framework accommodates high-dimensional responses and covariates, and encourages covariate effects on both the mean and the precision matrix to be sparse. In particular for the precision matrix, we stipulate simultaneous sparsity, i.e., group sparsity and element-wise sparsity, on effective covariates and their effects on network edges, respectively. We establish variable selection consistency first under the case with known mean parameters and then a more challenging case with unknown means depending on external covariates, and show in both cases that the convergence rate of the estimated precision parameters is faster than that obtained by lasso or group lasso, a desirable property for the sparse group lasso estimation. The utility and efficacy of our proposed method are demonstrated through simulation studies and an application to a co-expression QTL study with brain cancer patients.

Methodology, Statistics Theory

On the Beta Prime Prior for Scale Parameters in High-Dimensional Bayesian Regression Models

Ray Bai, Malay Ghosh (2018)

We study high-dimensional Bayesian linear regression with a general beta prime distribution for the scale parameter. Under the assumption of sparsity, we show that appropriate selection of the hyperparameters in the beta prime prior leads to the (near) minimax posterior contraction rate when $p \gg n$. For finite samples, we propose a data-adaptive method for estimating the hyperparameters based on marginal maximum likelihood (MML). This enables our prior to adapt to both sparse and dense settings, and under our proposed empirical Bayes procedure, the MML estimates are never at risk of collapsing to zero. We derive efficient Monte Carlo EM and variational EM algorithms for implementing our model, which are available in the R package NormalBetaPrime. Simulations and analysis of a gene expression data set illustrate our model's self-adaptivity to varying levels of sparsity and signal strengths.

Methodology

Inference for High-dimensional Maximin Effects in Heterogeneous Regression Models Using a Sampling Approach

Zijian Guo (2020)

Heterogeneity is an important feature of modern data sets and a central task is to extract information from large-scale and heterogeneous data. In this paper, we consider multiple high-dimensional linear models and adopt the definition of maximin effect (Meinshausen, Bühlmann, AoS, 43(4), 1801--1830) to summarize the information contained in this heterogeneous model. We define the maximin effect for a targeted population whose covariate distribution is possibly different from that of the observed data. We further introduce a ridge-type maximin effect to simultaneously account for reward optimality and statistical stability. To identify the high-dimensional maximin effect, we estimate the regression covariance matrix by a debiased estimator and use it to construct the aggregation weights for the maximin effect. A main challenge for statistical inference is that the estimated weights might have a mixture distribution and the resulting maximin effect estimator is not necessarily asymptotically normal. To address this, we devise a novel sampling approach to construct the confidence interval for any linear contrast of high-dimensional maximin effects. The coverage and precision properties of the proposed confidence interval are studied. The proposed method is demonstrated in simulations and on a genetic data set on yeast colony growth under different environments.

Methodology, Statistics Theory, Machine Learning

Detecting Abrupt Changes in High-Dimensional Self-Exciting Poisson Processes

Daren Wang, Yi Yu, Rebecca Willett (2020)

High-dimensional self-exciting point processes have been widely used in many application areas to model discrete event data in which past and current events affect the likelihood of future events. In this paper, we are concerned with detecting abrupt changes of the coefficient matrices in discrete-time high-dimensional self-exciting Poisson processes, which have yet to be studied in the existing literature due to both theoretical and computational challenges rooted in the non-stationary and high-dimensional nature of the underlying process. We propose a penalized dynamic programming approach which is supported by a theoretical rate analysis and numerical evidence.

Methodology, Statistics Theory

Post-Lasso Inference for High-Dimensional Regression

X. Jessie Jeng, Huimin Peng, Wenbin Lu (2018)

Among the most popular variable selection procedures in high-dimensional regression, Lasso provides a solution path to rank the variables and determines a cut-off position on the path to select variables and estimate coefficients. In this paper, we consider variable selection from a new perspective motivated by the frequently occurring phenomenon that relevant variables are not completely distinguishable from noise variables on the solution path. We propose to characterize the positions of the first noise variable and the last relevant variable on the path. We then develop a new variable selection procedure to control over-selection of the noise variables ranking after the last relevant variable, and, at the same time, retain a high proportion of relevant variables ranking before the first noise variable. Our procedure utilizes the recently developed covariance test statistic and Q statistic in post-selection inference. In numerical examples, our method compares favorably with other existing methods in selection accuracy and the ability to interpret its results.

Methodology
