In this paper, we consider regression models with a Hilbert-space-valued predictor and a scalar response, where the response depends on the predictor only through a finite number of projections. The linear subspace spanned by these projections is called the effective dimension reduction (EDR) space. To determine the dimensionality of the EDR space, we focus on the leading principal component scores of the predictor, and propose two sequential $\chi^2$ testing procedures under the assumption that the predictor has an elliptically contoured distribution. We further extend these procedures and introduce a test that simultaneously takes into account a large number of principal component scores. The proposed procedures are supported by theory, validated by simulation studies, and illustrated by a real-data example. Our methods and theory are applicable to functional data and high-dimensional multivariate data.
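To make the sequential testing idea concrete, the following is a minimal sketch under stated assumptions, not the authors' exact procedure: it builds a sliced-inverse-regression-type candidate matrix from standardized principal component scores and tests the dimension $k = 0, 1, \dots$ with Li-style $\chi^2$ statistics, stopping at the first non-rejection. The function name `sequential_chi2_dim`, the slicing scheme, and the degrees-of-freedom formula are illustrative choices.

```python
import numpy as np
from scipy.stats import chi2


def sequential_chi2_dim(Z, y, n_slices=10, alpha=0.05):
    """Sequential chi-square tests for the EDR dimension, applied to the
    leading principal component scores Z (n x m) of the predictor.
    Illustrative SIR-type sketch only."""
    n, m = Z.shape
    # Standardize the scores so the candidate matrix lives on the
    # identity-covariance scale.
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)
    # Slice the response and average the standardized scores within slices.
    order = np.argsort(y)
    M = np.zeros((m, m))
    for idx in np.array_split(order, n_slices):
        p_h = len(idx) / n
        mean_h = Zs[idx].mean(axis=0)
        M += p_h * np.outer(mean_h, mean_h)
    eigvals = np.sort(np.linalg.eigvalsh(M))[::-1]  # descending order
    # Test H0: dimension = k against dimension > k for k = 0, 1, ...,
    # stopping at the first non-rejection.
    for k in range(m):
        stat = n * eigvals[k:].sum()
        df = (m - k) * (n_slices - 1 - k)
        if df <= 0 or chi2.sf(stat, df) > alpha:
            return k
    return m
```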
We consider an $l_1$-penalization procedure in the non-parametric Gaussian regression model. In many concrete examples, the dimension $d$ of the input variable $X$ is very large (sometimes depending on the number of observations). Estimation of a $\beta$-regular regression function $f$ cannot be faster than the slow rate $n^{-2\beta/(2\beta+d)}$. Fortunately, in some situations, $f$ depends only on a few of the coordinates of $X$. In this paper, we construct two procedures. The first selects, with high probability, these coordinates. Then, using this subset selection method, we run a local polynomial estimator (on the set of relevant coordinates) to estimate the regression function at the rate $n^{-2\beta/(2\beta+d^*)}$, where $d^*$, the real dimension of the problem (the exact number of variables on which $f$ depends), replaces the dimension $d$ of the design. To achieve this result, we use an $l_1$-penalization method in this non-parametric setup.
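The two-stage idea can be sketched as follows; this is an illustrative implementation under stated assumptions, not the paper's procedure. It uses scikit-learn's `LassoCV` for the $l_1$-penalized coordinate-selection step and a simple Gaussian-kernel local linear estimator (a degree-one local polynomial) on the selected coordinates; the function names and the bandwidth are hypothetical choices.

```python
import numpy as np
from sklearn.linear_model import LassoCV


def select_coordinates(X, y):
    """Stage 1: l1-penalized (Lasso) fit; keep the coordinates with
    nonzero estimated coefficients."""
    lasso = LassoCV(cv=5).fit(X, y)
    return np.flatnonzero(np.abs(lasso.coef_) > 1e-8)


def local_linear_predict(X_sel, y, x0, bandwidth=0.5):
    """Stage 2: local linear fit (degree-one local polynomial) at x0 on the
    selected coordinates, with a Gaussian kernel; the fitted intercept is
    the estimate of f(x0)."""
    w = np.exp(-np.sum((X_sel - x0) ** 2, axis=1) / (2.0 * bandwidth ** 2))
    sw = np.sqrt(w)
    D = np.hstack([np.ones((X_sel.shape[0], 1)), X_sel - x0])  # local design
    beta, *_ = np.linalg.lstsq(D * sw[:, None], y * sw, rcond=None)
    return beta[0]
```

One would then call `support = select_coordinates(X, y)` followed by `local_linear_predict(X[:, support], y, x0[support])` to estimate $f$ at a point $x_0$, so that the smoothing step only sees the $d^*$ selected coordinates.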
Suppose that $Y$ is a scalar and $X$ is a second-order stochastic process, where $Y$ and $X$ are conditionally independent given the random variables $\xi_1,\ldots,\xi_p$ which belong to the closed span $L_X^2$ of $X$. This paper investigates a unified framework for the inverse regression dimension-reduction problem. It is found that the identification of $L_X^2$ with the reproducing kernel Hilbert space of $X$ provides a platform for a seamless extension from the finite- to infinite-dimensional settings. It also facilitates convenient computational algorithms that can be applied to a variety of models.
An important problem in large-scale inference is the identification of variables that have large correlations or partial correlations. Recent work has yielded breakthroughs in the ultra-high dimensional setting in which the sample size $n$ is fixed and the dimension $p \rightarrow \infty$ [Hero and Rajaratnam, 2011, 2012]. Despite these advances, the correlation screening framework suffers from some serious practical, methodological and theoretical deficiencies. For instance, theoretical safeguards for partial correlation screening require that the population covariance matrix be block diagonal. This block-sparsity assumption is, however, highly restrictive in numerous practical applications. As a second example, results for the correlation and partial correlation screening framework require the estimation of dependence measures or functionals, which can be computationally prohibitive. In this paper, we propose a unifying approach to correlation and partial correlation mining that moves beyond the block-diagonal correlation structure, thus yielding a methodology suitable for modern applications. By making connections to random geometric graphs, the number of highly correlated or partially correlated variables is shown to have a novel compound Poisson finite-sample characterization, which holds both for finite $p$ and as $p \rightarrow \infty$. The unifying framework also demonstrates an important duality between correlation and partial correlation screening, with significant theoretical and practical consequences.
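As a rough illustration of the screening step itself (not of the compound Poisson theory), the sketch below thresholds the sample correlation matrix and counts the variables whose maximal absolute correlation with another variable exceeds a level $\rho$; the threshold value and function name are assumptions, and for very large $p$ one would avoid forming the full $p \times p$ matrix.

```python
import numpy as np


def correlation_screen(X, rho=0.9):
    """Flag variables whose maximal absolute sample correlation with any
    other variable exceeds rho; returns the count and the indices.
    Illustrative screening step only."""
    n = X.shape[0]
    # Column-standardize so that Z.T @ Z / (n - 1) is the sample
    # correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = (Z.T @ Z) / (n - 1)
    np.fill_diagonal(R, 0.0)              # ignore self-correlations
    hits = np.abs(R).max(axis=1) > rho    # highly correlated variables
    return int(hits.sum()), np.flatnonzero(hits)
```

The compound Poisson characterizations described above concern the distribution of this count when $n$ is fixed and $p$ grows.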
In recent years, manifold methods have moved into focus as tools for dimension reduction. Assuming that the high-dimensional data actually lie on or close to a low-dimensional nonlinear manifold, these methods have shown convincing results in several settings. This manifold assumption is often reasonable for functional data, i.e., data representing continuously observed functions, as well. However, the performance of manifold methods recently proposed for tabular or image data has not yet been systematically assessed in the case of functional data. Moreover, it is unclear how to evaluate the quality of learned embeddings that do not yield invertible mappings, since the reconstruction error cannot be used as a performance measure for such representations. In this work, we describe and investigate the specific challenges that the functional data setting poses for nonlinear dimension reduction. The contributions of the paper are three-fold: first, we define a theoretical framework that allows us to systematically assess specific challenges arising in the functional data context, transfer several nonlinear dimension reduction methods for tabular and image data to functional data, and show that manifold methods can be used successfully in this setting. Second, we subject performance assessment and tuning strategies to a thorough and systematic evaluation based on several different functional data settings and point out some previously undescribed weaknesses and pitfalls that can jeopardize reliable judgment of embedding quality. Third, we propose a nuanced approach for making more objective and trustworthy decisions for or against competing nonconforming embeddings.
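As a minimal illustration of applying a manifold method to functional data, the sketch below treats curves observed on a common, equally spaced grid as vectors (so Euclidean distances between rows approximate $L^2$ distances between functions) and embeds them with scikit-learn's Isomap; the synthetic one-parameter family of curves and all tuning values are assumptions, and this is not the evaluation framework proposed in the paper.

```python
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)                # common observation grid
u = rng.uniform(0.0, 2.0 * np.pi, size=200)   # one latent parameter
# Each row is one curve t -> sin(u * t) plus noise, so the data lie near a
# one-dimensional nonlinear manifold in function space.
curves = np.sin(np.outer(u, t)) + 0.05 * rng.standard_normal((200, t.size))

# Apply a standard manifold method to the vectorized curves.
embedding = Isomap(n_neighbors=10, n_components=2).fit_transform(curves)
print(embedding.shape)  # (200, 2)
```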
A dimension reduction for the hyperbolic space is established. When points are far apart, an embedding into the hyperbolic plane with bounded distortion is achieved.