We introduce the concept of Hypoelliptic Diffusion Maps (HDM), a framework generalizing Diffusion Maps in the context of manifold learning and dimensionality reduction. Standard non-linear dimensionality reduction methods (e.g., LLE, ISOMAP, Laplacian Eigenmaps, Diffusion Maps) focus on mining massive data sets using weighted affinity graphs; Orientable Diffusion Maps and Vector Diffusion Maps enrich these graphs by also attaching local geometric information to each node. HDM likewise considers a scenario in which each node carries additional structure, which is now itself of interest to investigate. In effect, HDM augments the original data set with the attached structures and provides tools for studying and organizing the augmented ensemble. The goal is to obtain information both on the individual structures attached to the nodes and on the relationships between structures attached to nearby nodes, so as to study the underlying manifold from which the nodes are sampled. In this paper, we analyze HDM on tangent bundles, revealing its intimate connection with sub-Riemannian geometry and a family of hypoelliptic differential operators. In a later paper, we shall consider more general fibre bundles.
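As background for what HDM generalizes, the following is a minimal sketch of the classical diffusion-maps embedding on a plain point cloud with no attached structures. The function name, the bandwidth eps, and the normalisation details are illustrative choices, not the paper's construction.

```python
import numpy as np

def diffusion_map(X, eps, n_coords=2, t=1):
    """Classical diffusion maps on a point cloud X of shape (n, dim) -- the
    baseline construction that HDM augments with per-node structures."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / eps)                       # Gaussian affinity graph
    d = W.sum(axis=1)
    # symmetric conjugate of the random-walk operator D^{-1} W
    S = W / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)
    evals, evecs = evals[::-1], evecs[:, ::-1]  # sort in descending order
    psi = evecs / np.sqrt(d)[:, None]           # right eigenvectors of D^{-1} W
    # drop the trivial top eigenvector; scale coordinates by eigenvalue^t
    return psi[:, 1:n_coords + 1] * evals[1:n_coords + 1] ** t
```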
Kernel-based non-linear dimensionality reduction methods, such as Locally Linear Embedding (LLE) and Laplacian Eigenmaps, rely heavily upon pairwise distances or similarity scores, with which one can construct and study a weighted graph associated with the dataset. When each individual data object carries additional structural details, however, the correspondence relations between these structures provide extra information that can be leveraged when studying the dataset through the graph. Based on this observation, we generalize Diffusion Maps (DM) in manifold learning and introduce the framework of Horizontal Diffusion Maps (HDM). We model a dataset with pairwise structural correspondences as a fibre bundle equipped with a connection. We demonstrate the advantage of incorporating such additional information and study the asymptotic behavior of HDM on general fibre bundles. In a broader context, HDM reveals the sub-Riemannian structure of high-dimensional datasets, and provides a nonparametric learning framework for datasets with structural correspondences.
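To make the idea of leveraging pairwise structural correspondences concrete, the sketch below assembles a block affinity matrix whose (i, j) block couples the scalar weight between nodes i and j with the correspondence matrix between their attached structures, in the spirit of Vector Diffusion Maps. It is only a stand-in: the actual HDM kernel on a fibre bundle with a connection is more general than this construction.

```python
import numpy as np

def coupled_affinity(W, C):
    """Assemble a block affinity matrix whose (i, j) block is W[i, j] * C[i][j].

    W : (n, n) scalar affinities between base points (e.g. Gaussian weights).
    C : n x n nested list of (d, d) correspondence matrices between the
        structures attached to nodes i and j (ideally C[i][j] == C[j][i].T).
    Illustrative only; the (n*d, n*d) result can be normalised and
    eigendecomposed in the same way as an ordinary affinity matrix.
    """
    n = W.shape[0]
    d = C[0][0].shape[0]
    S = np.zeros((n * d, n * d))
    for i in range(n):
        for j in range(n):
            S[i * d:(i + 1) * d, j * d:(j + 1) * d] = W[i, j] * C[i][j]
    return S
```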
Diffusion maps is a manifold learning algorithm widely used for dimensionality reduction. Using a sample from a distribution, it approximates the eigenvalues and eigenfunctions of associated Laplace-Beltrami operators. Theoretical bounds on the approximation error, however, are generally much weaker than the rates seen in practice. This paper uses new approaches to improve the error bounds in the model case where the distribution is supported on a hypertorus. For the data sampling (variance) component of the error, we make spatially localised compact embedding estimates on certain Hardy spaces; we study the deterministic (bias) component as a perturbation of the PDE associated with the Laplace-Beltrami operator, and apply relevant spectral stability results. Using these approaches, we match long-standing pointwise error bounds for both the spectral data and the norm convergence of the operator discretisation. We also introduce an alternative normalisation for diffusion maps based on Sinkhorn weights. This normalisation approximates a Langevin diffusion on the sample and yields a symmetric operator approximation. We prove that it has better convergence compared with the standard normalisation on flat domains, and present a highly efficient algorithm to compute the Sinkhorn weights.
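For readers who want to experiment, here is a minimal sketch of the symmetric Sinkhorn normalisation idea: scale a Gaussian kernel matrix by weights so that it becomes (approximately) doubly stochastic, and then work with the resulting symmetric operator. The damped fixed-point iteration below is purely illustrative and is not the efficient algorithm referred to in the abstract.

```python
import numpy as np

def sinkhorn_weights(K, n_iter=500, tol=1e-12):
    """Find d > 0 with diag(d) @ K @ diag(d) approximately doubly stochastic."""
    d = np.ones(K.shape[0])
    for _ in range(n_iter):
        d_new = np.sqrt(d / (K @ d))      # damped symmetric Sinkhorn step
        if np.max(np.abs(d_new - d)) < tol:
            return d_new
        d = d_new
    return d

# toy usage on a flat domain: build a Gaussian kernel, balance it, and
# eigendecompose the resulting symmetric operator
X = np.random.default_rng(0).uniform(size=(300, 2))
d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-d2 / 0.05)
w = sinkhorn_weights(K)
A = w[:, None] * K * w[None, :]           # symmetric, rows/columns sum to ~1
evals, evecs = np.linalg.eigh(A)
```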
We define and study natural $\mathrm{SU}(2)$-structures, in the sense of Conti-Salamon, on the total space $\mathcal{S}$ of the tangent sphere bundle of any given oriented Riemannian 3-manifold $M$. We make use of a fundamental exterior differential system of Riemannian geometry. Essentially, two types of structure arise: the contact-hypo and the non-contact, and for each we study the conditions for being hypo, nearly-hypo or double-hypo. We discover new double-hypo structures on $S^3\times S^2$, of which the well-known Sasaki-Einstein structures are a particular case. Examples from hyperbolic geometry also appear. In the search for the associated metrics, we establish a theorem, applicable to all $\mathrm{SU}(2)$-structures in general, which is useful for determining the metric explicitly. Within our application to tangent sphere bundles, we discover a whole new class of metrics specific to 3-dimensional geometry. The evolution equations of Conti-Salamon are considered, leading to a new integrable $\mathrm{SU}(3)$-structure on $\mathcal{S}\times\mathbb{R}_+$ associated to any flat $M$.
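For the reader's convenience, we recall the terminology in Conti-Salamon's notation as it is usually stated in the literature: an $\mathrm{SU}(2)$-structure on a 5-manifold is determined by differential forms $(\alpha,\omega_1,\omega_2,\omega_3)$, and the structure is called hypo when
\[
  d\omega_1 = 0, \qquad d(\alpha\wedge\omega_2) = 0, \qquad d(\alpha\wedge\omega_3) = 0 .
\]
The nearly-hypo and double-hypo conditions studied in the paper are variants of these closedness equations.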
Let $(X,Y)$ be a random variable consisting of an observed feature vector $X\in\mathcal{X}$ and an unobserved class label $Y\in\{1,2,\dots,L\}$ with unknown joint distribution. In addition, let $\mathcal{D}$ be a training data set consisting of $n$ completely observed independent copies of $(X,Y)$. Usual classification procedures provide point predictors (classifiers) $\widehat{Y}(X,\mathcal{D})$ of $Y$ or estimate the conditional distribution of $Y$ given $X$. In order to quantify the certainty of classifying $X$ we propose to construct for each $\theta=1,2,\dots,L$ a p-value $\pi_{\theta}(X,\mathcal{D})$ for the null hypothesis that $Y=\theta$, treating $Y$ temporarily as a fixed parameter. In other words, the point predictor $\widehat{Y}(X,\mathcal{D})$ is replaced with a prediction region for $Y$ with a certain confidence. We argue that (i) this approach is advantageous over traditional approaches and (ii) any reasonable classifier can be modified to yield nonparametric p-values. We discuss issues such as optimality, single use and multiple use validity, as well as computational and graphical aspects.
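As an illustration of point (ii), the sketch below turns an arbitrary per-class score (for instance, a classifier's estimated class probability) into rank-based p-values and a prediction region. It is a generic stand-in in the spirit of the abstract, not necessarily the paper's exact construction.

```python
import numpy as np

def class_pvalues(score, X_train, y_train, x_new, labels):
    """Rank-based p-values for each candidate label theta.

    score(x, theta) should be larger the more typical x is of class theta
    (e.g. a classifier's estimated probability of class theta at x).
    """
    pvals = {}
    for theta in labels:
        s_class = np.array([score(x, theta)
                            for x, y in zip(X_train, y_train) if y == theta])
        s_new = score(x_new, theta)
        # treat x_new as one extra draw from class theta and rank its score
        pvals[theta] = (1 + np.sum(s_class <= s_new)) / (len(s_class) + 1)
    return pvals

def prediction_region(pvals, alpha=0.05):
    # all labels that are not rejected at level alpha
    return {theta for theta, p in pvals.items() if p > alpha}
```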
The lasso and related sparsity-inducing algorithms have been the target of substantial theoretical and applied research. Correspondingly, many results are known about their behavior for a fixed or optimally chosen tuning parameter specified up to unknown constants. In practice, however, this oracle tuning parameter is inaccessible, so one must use the data to select it. Common statistical practice is to use a variant of cross-validation for this task. However, little is known about the theoretical properties of the resulting predictions with such data-dependent methods. We consider the high-dimensional setting with random design wherein the number of predictors $p$ grows with the number of observations $n$. Under typical assumptions on the data generating process, similar to those in the literature, we recover oracle rates up to a log factor when choosing the tuning parameter with cross-validation. Under weaker conditions, when the true model is not necessarily linear, we show that the lasso remains risk consistent relative to its linear oracle. We also generalize these results to the group lasso and square-root lasso and investigate the predictive and model selection performance of cross-validation via simulation.
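For concreteness, here is a minimal simulation of the setting discussed, with the lasso penalty chosen by cross-validation via scikit-learn's LassoCV. The data-generating process and all parameter values are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, s = 200, 500, 5                     # high-dimensional: p > n, sparse truth
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 1.0                            # only the first s coefficients are active
y = X @ beta + 0.5 * rng.standard_normal(n)

# choose the penalty level by 10-fold cross-validation over a grid of alphas
model = LassoCV(cv=10).fit(X, y)
print("selected alpha:", model.alpha_)
print("indices of nonzero estimated coefficients:", np.flatnonzero(model.coef_))
```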