ترغب بنشر مسار تعليمي؟ اضغط هنا

Stochastic Cluster Embedding

66   0   0.0 ( 0 )
 نشر من قبل Zhirong Yang
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Neighbor Embedding (NE) aims to preserve pairwise similarities between data items and has been shown to yield an effective principle for data visualization. However, even the best existing NE methods such as Stochastic Neighbor Embedding (SNE) may leave large-scale patterns hidden, for example clusters, despite strong signals being present in the data. To address this, we propose a new cluster visualization method based on the Neighbor Embedding principle. We first present a family of Neighbor Embedding methods that generalizes SNE by using non-normalized Kullback-Leibler divergence with a scale parameter. In this family, much better cluster visualizations often appear with a parameter value different from the one corresponding to SNE. We also develop an efficient software that employs asynchronous stochastic block coordinate descent to optimize the new family of objective functions. Our experimental results demonstrate that the method consistently and substantially improves the visualization of data clusters compared with the state-of-the-art NE approaches.

قيم البحث

اقرأ أيضاً

Estimating the probability of rare failure events is an essential step in the reliability assessment of engineering systems. Computing this failure probability for complex non-linear systems is challenging, and has recently spurred the development of active-learning reliability methods. These methods approximate the limit-state function (LSF) using surrogate models trained with a sequentially enriched set of model evaluations. A recently proposed method called stochastic spectral embedding (SSE) aims to improve the local approximation accuracy of global, spectral surrogate modelling techniques by sequentially embedding local residual expansions in subdomains of the input space. In this work we apply SSE to the LSF, giving rise to a stochastic spectral embedding-based reliability (SSER) method. The resulting partition of the input space decomposes the failure probability into a set of easy-to-compute domain-wise failure probabilities. We propose a set of modifications that tailor the algorithm to efficiently solve rare event estimation problems. These modifications include specialized refinement domain selection, partitioning and enrichment strategies. We showcase the algorithm performance on four benchmark problems of various dimensionality and complexity in the LSF.
In deep neural nets, lower level embedding layers account for a large portion of the total number of parameters. Tikhonov regularization, graph-based regularization, and hard parameter sharing are approaches that introduce explicit biases into traini ng in a hope to reduce statistical complexity. Alternatively, we propose stochastically shared embeddings (SSE), a data-driven approach to regularizing embedding layers, which stochastically transitions between embeddings during stochastic gradient descent (SGD). Because SSE integrates seamlessly with existing SGD algorithms, it can be used with only minor modifications when training large scale neural networks. We develop tw
Heterogeneous Information Network (HIN) embedding refers to the low-dimensional projections of the HIN nodes that preserve the HIN structure and semantics. HIN embedding has emerged as a promising research field for network analysis as it enables dow nstream tasks such as clustering and node classification. In this work, we propose ours for joint learning of cluster embeddings as well as cluster-aware HIN embedding. We assume that the connected nodes are highly likely to fall in the same cluster, and adopt a variational approach to preserve the information in the pairwise relations in a cluster-aware manner. In addition, we deploy contrastive modules to simultaneously utilize the information in multiple meta-paths, thereby alleviating the meta-path selection problem - a challenge faced by many of the famous HIN embedding approaches. The HIN embedding, thus learned, not only improves the clustering performance but also preserves pairwise proximity as well as the high-order HIN structure. We show the effectiveness of our approach by comparing it with many competitive baselines on three real-world datasets on clustering and downstream node classification.
Graph is a natural representation of data for a variety of real-word applications, such as knowledge graph mining, social network analysis and biological network comparison. For these applications, graph embedding is crucial as it provides vector rep resentations of the graph. One limitation of existing graph embedding methods is that their embedding optimization procedures are disconnected from the target application. In this paper, we propose a novel approach, namely Customized Graph Embedding (CGE) to tackle this problem. The CGE algorithm learns customized vector representations of graph nodes by differentiating the importance of distinct graph paths automatically for a specific application. Extensive experiments were carried out on a diverse set of node classification datasets, which demonstrate strong performances of CGE and provide deep insights into the model.
Detecting communities on graphs has received significant interest in recent literature. Current state-of-the-art community embedding approach called textit{ComE} tackles this problem by coupling graph embedding with community detection. Considering t he success of hyperbolic representations of graph-structured data in last years, an ongoing challenge is to set up a hyperbolic approach for the community detection problem. The present paper meets this challenge by introducing a Riemannian equivalent of textit{ComE}. Our proposed approach combines hyperbolic embeddings with Riemannian K-means or Riemannian mixture models to perform community detection. We illustrate the usefulness of this framework through several experiments on real-world social networks and comparisons with textit{ComE} and recent hyperbolic-based classification approaches.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا