Kernel-based non-linear dimensionality reduction methods, such as Local Linear Embedding (LLE) and Laplacian Eigenmaps, rely heavily upon pairwise distances or similarity scores, with which one can construct and study a weighted graph associated with the dataset. When each individual data object carries additional structural details, however, the correspondence relations between these structures provide extra information that can be leveraged for studying the dataset using the graph. Based on this observation, we generalize Diffusion Maps (DM) in manifold learning and introduce the framework of Horizontal Diffusion Maps (HDM). We model a dataset with pairwise structural correspondences as a fibre bundle equipped with a connection. We demonstrate the advantage of incorporating such additional information and study the asymptotic behavior of HDM on general fibre bundles. In a broader context, HDM reveals the sub-Riemannian structure of high-dimensional datasets, and provides a nonparametric learning framework for datasets with structural correspondences.
We introduce the concept of Hypoelliptic Diffusion Maps (HDM), a framework generalizing Diffusion Maps in the context of manifold learning and dimensionality reduction. Standard non-linear dimensionality reduction methods (e.g., LLE, ISOMAP, Laplacian Eigenmaps, Diffusion Maps) focus on mining massive data sets using weighted affinity graphs; Orientable Diffusion Maps and Vector Diffusion Maps enrich these graphs by also attaching local geometric information to each node. HDM likewise considers a scenario in which each node possesses additional structure, which is now itself of interest to investigate. In effect, HDM augments the original data set with the attached structures and provides tools for studying and organizing the augmented ensemble. The goal is to obtain information both on the individual structures attached to the nodes and on the relationships between structures attached to nearby nodes, so as to study the underlying manifold from which the nodes are sampled. In this paper, we analyze HDM on tangent bundles, revealing its intimate connection with sub-Riemannian geometry and a family of hypoelliptic differential operators. In a later paper, we shall consider more general fibre bundles.
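The diffusion-map construction that both abstracts build on can be sketched in a few lines: form a Gaussian affinity kernel on pairwise distances, row-normalize it into a Markov transition matrix, and embed the data with its leading non-trivial eigenvectors. The function below is an illustrative minimal sketch; the parameter names `epsilon`, `t`, and `n_coords` are our choices, not taken from the papers.

```python
import numpy as np

def diffusion_map(X, epsilon=1.0, n_coords=2, t=1):
    """Minimal Diffusion Maps sketch: Gaussian affinity kernel,
    row-normalized Markov matrix, spectral embedding.
    Illustrative only; `epsilon` and `t` are user-chosen."""
    # pairwise squared Euclidean distances
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / epsilon)                  # affinity kernel
    P = K / K.sum(axis=1, keepdims=True)       # Markov transition matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)             # sort eigenvalues descending
    vals, vecs = vals.real[order], vecs.real[:, order]
    # drop the trivial constant eigenvector, scale coordinates by lambda^t
    return vecs[:, 1:n_coords + 1] * (vals[1:n_coords + 1] ** t)
```

The HDM variants discussed above replace this scalar affinity with structure-aware (e.g., bundle or connection) data, but the spectral-embedding pipeline is analogous.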
We present a new adaptive kernel density estimator based on linear diffusion processes. The proposed estimator builds on existing ideas for adaptive smoothing by incorporating information from a pilot density estimate. In addition, we propose a new plug-in bandwidth selection method that is free from the arbitrary normal reference rules used by existing methods. We present simulation examples in which the proposed approach outperforms existing methods in terms of accuracy and reliability.
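The idea of adaptive smoothing driven by a pilot density estimate can be illustrated with the classical square-root law (Abramson-style) adaptive KDE, a simpler relative of the diffusion-based estimator described above; the pilot bandwidth `h0` defaults to a normal reference rule purely for illustration.

```python
import numpy as np

def adaptive_kde(data, grid, h0=None):
    """Pilot-based adaptive KDE sketch (square-root bandwidth law).
    Illustrative stand-in, not the paper's diffusion estimator."""
    n = len(data)
    if h0 is None:
        h0 = 1.06 * np.std(data) * n ** (-1 / 5)   # normal reference pilot
    # pilot density estimate evaluated at the data points
    pilot = np.exp(-0.5 * ((data[:, None] - data[None, :]) / h0) ** 2)
    pilot = pilot.sum(axis=1) / (n * h0 * np.sqrt(2 * np.pi))
    g = np.exp(np.mean(np.log(pilot)))             # geometric mean
    h_i = h0 * np.sqrt(g / pilot)                  # local bandwidths
    # final estimate: average of Gaussian kernels with local bandwidths
    u = (grid[:, None] - data[None, :]) / h_i[None, :]
    return np.mean(np.exp(-0.5 * u ** 2) / (h_i * np.sqrt(2 * np.pi)), axis=1)
```

Smaller bandwidths are used where the pilot density is high and larger ones in the tails, which is the adaptive behaviour the proposed diffusion estimator refines.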
We propose an update estimation method for a diffusion parameter from high-frequency dependent data in the presence of a nuisance drift term. We establish the asymptotic equivalence of the estimator to the corresponding quasi-MLE, which is asymptotically normal and asymptotically efficient. A simulation example illustrates the theory.
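The reason the drift can be treated as a nuisance at high frequency is that it contributes only $O(\Delta t)$ per increment, while the diffusion term contributes $O(\sqrt{\Delta t})$; the simplest estimator exploiting this is the normalized sum of squared increments. The sketch below shows this baseline fact, not the paper's update scheme.

```python
import numpy as np

def realized_volatility(x, dt):
    """Baseline diffusion-parameter estimator from high-frequency data:
    sum of squared increments over the time horizon estimates sigma^2,
    robust to a bounded nuisance drift. Illustrative sketch only."""
    increments = np.diff(x)
    T = dt * len(increments)
    return np.sum(increments ** 2) / T   # estimates sigma^2
```

The quasi-MLE analyzed in the paper refines this by weighting increments through the quasi-likelihood, attaining the efficient asymptotic variance.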
In this paper we consider an ergodic diffusion process with jumps whose drift coefficient depends on an unknown parameter $\theta$. We suppose that the process is discretely observed at the instants $(t_i^n)_{i=0,\dots,n}$ with $\Delta_n = \sup_{i=0,\dots,n-1}(t_{i+1}^n - t_i^n) \rightarrow 0$. We introduce an estimator of $\theta$, based on a contrast function, which is efficient without requiring any condition on the rate at which $\Delta_n \rightarrow 0$, and which allows the observed process to have non-summable jumps. This extends earlier results, where the condition $n\Delta_n^3 \rightarrow 0$ was needed (see [10], [24]) and where the process was supposed to have summable jumps. Moreover, in the case of a finite jump activity, we propose explicit approximations of the contrast function, such that the efficient estimation of $\theta$ is feasible under the condition that $n\Delta_n^k \rightarrow 0$, where $k > 0$ can be arbitrarily large. This extends the results obtained by Kessler [15] in the case of continuous processes.

Keywords: Lévy-driven SDEs, efficient drift estimation, high-frequency data, ergodic properties, thresholding methods.
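The thresholding idea behind such contrast functions can be illustrated on an Ornstein-Uhlenbeck-type model $dX_t = -\theta X_t\,dt + dW_t + \text{jumps}$: increments larger than a power of the step size are attributed to jumps and discarded, and the surviving increments enter an ordinary least-squares contrast. The constants `c` and `beta` below are illustrative tuning choices, not taken from the paper.

```python
import numpy as np

def thresholded_drift_estimate(x, dt, beta=0.49, c=3.0):
    """Illustrative thresholded least-squares drift estimator for
    dX = -theta * X dt + dW + jumps: increments exceeding c * dt**beta
    are treated as jump-contaminated and removed before forming the
    least-squares contrast. Sketch only; `c`, `beta` are assumptions."""
    dx = np.diff(x)
    xs = x[:-1]
    keep = np.abs(dx) <= c * dt ** beta     # jump-filtering threshold
    # minimizer of sum_i (dx_i + theta * x_i * dt)^2 over kept increments
    return -np.sum(xs[keep] * dx[keep]) / (dt * np.sum(xs[keep] ** 2))
```

Since diffusive increments are of order $\sqrt{\Delta_n}$ while the threshold $c\,\Delta_n^\beta$ with $\beta < 1/2$ shrinks more slowly, the filter asymptotically keeps the diffusive part and removes the jumps.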
We aim at estimating the invariant density associated to a stochastic differential equation with jumps in low dimension, namely $d=1$ and $d=2$. We consider a class of jump diffusion processes whose invariant density belongs to some Hölder space. Firstly, in dimension one, we show that the kernel density estimator achieves the convergence rate $\frac{1}{T}$, which is the optimal rate in the absence of jumps. This improves the convergence rate obtained in [Amorino, Gloter (2021)], which depends on the Blumenthal-Getoor index for $d=1$ and is equal to $\frac{\log T}{T}$ for $d=2$. Secondly, we show that it is not possible to find an estimator with a faster rate of estimation. Indeed, we obtain lower bounds with the same rates, $\frac{1}{T}$ and $\frac{\log T}{T}$, in the one- and two-dimensional cases, respectively. Finally, we establish the asymptotic normality of the estimator in the one-dimensional case.
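The kernel density estimator in question is the time average $\hat\pi_T(x) = \frac{1}{T}\int_0^T K_h(x - X_s)\,ds$, approximated in practice by a Riemann sum over the discrete observations. A minimal one-dimensional sketch, with a Gaussian kernel and a user-supplied illustrative bandwidth `h`:

```python
import numpy as np

def invariant_density_estimate(path, x, h):
    """Gaussian-kernel estimator of the invariant density from a
    discretized trajectory: Riemann-sum approximation of
    (1/T) * int_0^T K_h(x - X_s) ds. Illustrative sketch; the
    bandwidth `h` is an assumption, not a rate from the paper."""
    u = (x[:, None] - path[None, :]) / h
    return np.mean(np.exp(-0.5 * u ** 2), axis=1) / (h * np.sqrt(2 * np.pi))
```

Averaging over the whole trajectory is what produces the superefficient $\frac{1}{T}$ rate in dimension one, faster than the usual nonparametric rates for i.i.d. data.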