بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

M-estimation in high-dimensional linear model

187 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Kai Wang

تاريخ النشر 2018

مجال البحث

والبحث باللغة English

تأليف Kai Wang - Yanling Zhu

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We mainly study the M-estimation method for the high-dimensional linear regression model, and discuss the properties of M-estimator when the penalty term is the local linear approximation. In fact, M-estimation method is a framework, which covers the methods of the least absolute deviation, the quantile regression, least squares regression and Huber regression. We show that the proposed estimator possesses the good properties by applying certain assumptions. In the part of numerical simulation, we select the appropriate algorithm to show the good robustness of this method

قيم البحث

419 - Z. Bai , D. Jiang , J. Yao 2012

For a multivariate linear model, Wilks likelihood ratio test (LRT) constitutes one of the cornerstone tools. However, the computation of its quantiles under the null or the alternative requires complex analytic approximations and more importantly, th ese distributional approximations are feasible only for moderate dimension of the dependent variable, say $ple 20$. On the other hand, assuming that the data dimension $p$ as well as the number $q$ of regression variables are fixed while the sample size $n$ grows, several asymptotic approximations are proposed in the literature for Wilks $bLa$ including the widely used chi-square approximation. In this paper, we consider necessary modifications to Wilks test in a high-dimensional context, specifically assuming a high data dimension $p$ and a large sample size $n$. Based on recent random matrix theory, the correction we propose to Wilks test is asymptotically Gaussian under the null and simulations demonstrate that the corrected LRT has very satisfactory size and power, surely in the large $p$ and large $n$ context, but also for moderately large data dimensions like $p=30$ or $p=50$. As a byproduct, we give a reason explaining why the standard chi-square approximation fails for high-dimensional data. We also introduce a new procedure for the classical multiple sample significance test in MANOVA which is valid for high-dimensional data.

المنهجية نظرية الإحصاء نظرية الإحصاء

High-dimensional changepoint estimation with heterogeneous missingness

384 - Bertille Follain , Tengyao Wang , Richard J. Samworth 2021

We propose a new method for changepoint estimation in partially-observed, high-dimensional time series that undergo a simultaneous change in mean in a sparse subset of coordinates. Our first methodological contribution is to introduce a MissCUSUM tra nsformation (a generalisation of the popular Cumulative Sum statistics), that captures the interaction between the signal strength and the level of missingness in each coordinate. In order to borrow strength across the coordinates, we propose to project these MissCUSUM statistics along a direction found as the solution to a penalised optimisation problem tailored to the specific sparsity structure. The changepoint can then be estimated as the location of the peak of the absolute value of the projected univariate series. In a model that allows different missingness probabilities in different component series, we identify that the key interaction between the missingness and the signal is a weighted sum of squares of the signal change in each coordinate, with weights given by the observation probabilities. More specifically, we prove that the angle between the estimated and oracle projection directions, as well as the changepoint location error, are controlled with high probability by the sum of two terms, both involving this weighted sum of squares, and representing the error incurred due to noise and the error due to missingness respectively. A lower bound confirms that our changepoint estimator, which we call MissInspect, is optimal up to a logarithmic factor. The striking effectiveness of the MissInspect methodology is further demonstrated both on simulated data, and on an oceanographic data set covering the Neogene period.

المنهجية نظرية الإحصاء نظرية الإحصاء

High-dimensional Central Limit Theorems by Steins Method

127 - Xiao Fang , Yuta Koike 2020

We obtain explicit error bounds for the $d$-dimensional normal approximation on hyperrectangles for a random vector that has a Stein kernel, or admits an exchangeable pair coupling, or is a non-linear statistic of independent random variables or a su m of $n$ locally dependent random vectors. We assume the approximating normal distribution has a non-singular covariance matrix. The error bounds vanish even when the dimension $d$ is much larger than the sample size $n$. We prove our main results using the approach of Gotze (1991) in Steins method, together with modifications of an estimate of Anderson, Hall and Titterington (1998) and a smoothing inequality of Bhattacharya and Rao (1976). For sums of $n$ independent and identically distributed isotropic random vectors having a log-concave density, we obtain an error bound that is optimal up to a $log n$ factor. We also discuss an application to multiple Wiener-It^{o} integrals.

الاحتمالات نظرية الإحصاء نظرية الإحصاء

Limiting behavior of largest entry of random tensor constructed by high-dimensional data

67 - Tiefeng Jiang , Junshan Xie 2019

Let ${X}_{k}=(x_{k1}, cdots, x_{kp}), k=1,cdots,n$, be a random sample of size $n$ coming from a $p$-dimensional population. For a fixed integer $mgeq 2$, consider a hypercubic random tensor $mathbf{{T}}$ of $m$-th order and rank $n$ with begin{eqnar ray*} mathbf{{T}}= sum_{k=1}^{n}underbrace{{X}_{k}otimescdotsotimes {X}_{k}}_{m~multiple}=Big(sum_{k=1}^{n} x_{ki_{1}}x_{ki_{2}}cdots x_{ki_{m}}Big)_{1leq i_{1},cdots, i_{m}leq p}. end{eqnarray*} Let $W_n$ be the largest off-diagonal entry of $mathbf{{T}}$. We derive the asymptotic distribution of $W_n$ under a suitable normalization for two cases. They are the ultra-high dimension case with $ptoinfty$ and $log p=o(n^{beta})$ and the high-dimension case with $pto infty$ and $p=O(n^{alpha})$ where $alpha,beta>0$. The normalizing constant of $W_n$ depends on $m$ and the limiting distribution of $W_n$ is a Gumbel-type distribution involved with parameter $m$.

الاحتمالات نظرية الإحصاء نظرية الإحصاء

Variance Estimation and Confidence Intervals from High-dimensional Genome-wide Association Studies Through Misspecified Mixed Model Analysis

113 - Cecilia Dao , Jiming Jiang , Debashis Paul 2021

We study variance estimation and associated confidence intervals for parameters characterizing genetic effects from genome-wide association studies (GWAS) misspecified mixed model analysis. Previous studies have shown that, in spite of the model miss pecification, certain quantities of genetic interests are estimable, and consistent estimators of these quantities can be obtained using the restricted maximum likelihood (REML) method under a misspecified linear mixed model. However, the asymptotic variance of such a REML estimator is complicated and not ready to be implemented for practical use. In this paper, we develop practical and computationally convenient methods for estimating such asymptotic variances and constructing the associated confidence intervals. Performance of the proposed methods is evaluated empirically based on Monte-Carlo simulations and real-data application.

المنهجية نظرية الإحصاء نظرية الإحصاء

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الوادي الدولية الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

M-estimation in high-dimensional linear model

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً