
Adaptive Manifold Clustering

Published by Franz Besold
Publication date: 2019
Research field: Mathematical Statistics
Language: English





Clustering methods seek to partition data such that elements are more similar to elements in the same cluster than to elements in different clusters. The main challenge in this task is the lack of a unified definition of a cluster, especially for high-dimensional data. Different methods and approaches have been proposed to address this problem. This paper continues the study originated by Efimov, Adamyan and Spokoiny (2019), where a novel approach to adaptive nonparametric clustering called Adaptive Weights Clustering (AWC) was offered. The method allows analyzing high-dimensional data with an unknown number of unbalanced clusters of arbitrary shape under very weak modeling assumptions. The procedure demonstrates state-of-the-art performance and is very efficient even for large data dimension D. However, the theoretical study in Efimov, Adamyan and Spokoiny (2019) is very limited and did not really address the question of efficiency. This paper makes a significant step in understanding the remarkable performance of the AWC procedure, particularly in high dimension. The approach is based on combining the ideas of adaptive clustering and manifold learning. The manifold hypothesis means that high-dimensional data can be well approximated by a d-dimensional manifold for small d. This helps to overcome the curse of dimensionality and to get sharp bounds on the cluster separation which only depend on the intrinsic dimension d. We also address the problem of parameter tuning. Our general theoretical results are illustrated by some numerical experiments.




Read also

We consider a problem of manifold estimation from noisy observations. Many manifold learning procedures locally approximate a manifold by a weighted average over a small neighborhood. However, in the presence of large noise, the assigned weights become so corrupted that the averaged estimate shows very poor performance. We suggest a novel computationally efficient structure-adaptive procedure which simultaneously reconstructs a smooth manifold and estimates projections of the point cloud onto this manifold. The proposed approach iteratively refines the weights on each step, using the structural information obtained at previous steps. After several iterations, we obtain nearly oracle weights, so that the final estimates are nearly efficient even in the presence of relatively large noise. In our theoretical study we establish tight lower and upper bounds proving asymptotic optimality of the method for manifold estimation under the Hausdorff loss, provided that the noise degrades to zero fast enough.
Prediction for high dimensional time series is a challenging task due to the curse of dimensionality problem. Classical parametric models like ARIMA or VAR require strong modeling assumptions and time stationarity and are often overparametrized. This paper offers a new flexible approach using recent ideas of manifold learning. The considered model includes linear models such as the central subspace model and ARIMA as particular cases. The proposed procedure combines manifold denoising techniques with a simple nonparametric prediction by local averaging. The resulting procedure demonstrates a very reasonable performance for real-life econometric time series. We also provide a theoretical justification of the manifold estimation procedure.
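The "simple nonparametric prediction by local averaging" mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the manifold denoising step is omitted entirely, and the function name `predict_next`, the window length `lag`, and the neighbor count `k` are hypothetical choices made only for illustration. The idea shown is to find the past windows most similar to the latest window and average the values that followed them.

```python
import math

def predict_next(series, lag, k):
    """Predict the next value of `series` by local averaging.

    Hypothetical helper: compares the most recent `lag` values with every
    earlier window of the same length and averages the successors of the
    `k` closest windows. No manifold denoising is performed here.
    """
    query = series[-lag:]
    candidates = []
    for t in range(len(series) - lag):
        window = series[t:t + lag]
        # Euclidean distance between a past window and the latest window.
        dist = math.dist(window, query)
        candidates.append((dist, series[t + lag]))
    # Keep the k nearest windows and average the values that followed them.
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(successor for _, successor in nearest) / len(nearest)

# Toy periodic series: the latest window [1, 2] was always followed by 0.
series = [0, 1, 2, 0, 1, 2, 0, 1, 2]
prediction = predict_next(series, lag=2, k=2)
```

On this toy series the two nearest past windows match the latest window exactly, so the prediction is the value that followed them. On real data the choice of `lag` and `k` would need tuning.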
R. Fraiman, F. Gamboa, L. Moreno (2018)
In the context of computer code experiments, sensitivity analysis of a complicated input-output system is often performed by ranking the so-called Sobol indices. One reason for the popularity of Sobol's approach is the simplicity of the statistical estimation of these indices using the so-called Pick and Freeze method. In this work we propose and study sensitivity indices for the case where the output lies on a Riemannian manifold. These indices are based on a Cramér-von Mises-like criterion that takes into account the geometry of the output support. We propose a Pick-Freeze-like estimator of these indices based on a $U$-statistic. The asymptotic properties of these estimators are studied. Further, we provide and discuss some interesting numerical examples.
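For readers unfamiliar with the classical Pick and Freeze method that this abstract builds on, the following is a minimal sketch for a scalar output. It does not implement the manifold-valued indices or the $U$-statistic estimator proposed in the paper. The function names, the standard normal inputs, and the toy model are assumptions made for illustration; the estimator shown is a common variant that pools both samples for the mean and variance.

```python
import random

def pick_freeze_first_order(f, dim, index, n, rng):
    """Estimate the first-order Sobol index of input `index` by Pick-Freeze.

    Illustrative assumption: inputs are independent standard normals.
    """
    ys, ys_frozen = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Redraw every coordinate except `index`, which is kept ("frozen").
        x_frozen = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x_frozen[index] = x[index]
        ys.append(f(x))
        ys_frozen.append(f(x_frozen))
    # Mean and variance pooled over both samples.
    mean_pooled = (sum(ys) + sum(ys_frozen)) / (2 * n)
    cov = sum(a * b for a, b in zip(ys, ys_frozen)) / n - mean_pooled ** 2
    var = sum(a * a + b * b for a, b in zip(ys, ys_frozen)) / (2 * n) - mean_pooled ** 2
    return cov / var

def toy_model(x):
    # Hypothetical test function: Var = 1 + 4 = 5, so the exact indices are 0.2 and 0.8.
    return x[0] + 2.0 * x[1]

rng = random.Random(0)
s1 = pick_freeze_first_order(toy_model, dim=2, index=0, n=20000, rng=rng)
s2 = pick_freeze_first_order(toy_model, dim=2, index=1, n=20000, rng=rng)
```

For this additive toy model the exact first-order indices are 1/5 and 4/5, so the two estimates should land close to those values and sum to roughly one.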
We discuss parametric estimation of a degenerate diffusion system from time-discrete observations. The first component of the degenerate diffusion system has a parameter $\theta_1$ in a non-degenerate diffusion coefficient and a parameter $\theta_2$ in the drift term. The second component has a drift term parameterized by $\theta_3$ and no diffusion term. Asymptotic normality is proved in three different situations: for an adaptive estimator for $\theta_3$ with some initial estimators for ($\theta_1$, $\theta_2$), an adaptive one-step estimator for ($\theta_1$, $\theta_2$, $\theta_3$) with some initial estimators for them, and a joint quasi-maximum likelihood estimator for ($\theta_1$, $\theta_2$, $\theta_3$) without any initial estimator. Our estimators incorporate information of the increments of both components. Thanks to this construction, the asymptotic variance of the estimators for $\theta_1$ is smaller than the standard one based only on the first component. The convergence of the estimators for $\theta_3$ is much faster than the other parameters. The resulting asymptotic variance is smaller than that of an estimator only using the increments of the second component.
We present a geometrical method for analyzing sequential estimating procedures. It is based on the design principle of the second-order efficient sequential estimation provided in Okamoto, Amari and Takeuchi (1991). By introducing a dual conformal curvature quantity, we clarify the conditions for the covariance minimization of sequential estimators. These conditions are further elaborated for the multidimensional curved exponential family. The theoretical results are then numerically examined by using typical statistical models, von Mises-Fisher and hyperboloid models.