Stochastic Cluster Embedding


Published by Zhirong Yang

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Zhirong Yang, Yuwei Chen, Denis Sedov

Machine learning


Abstract

Neighbor Embedding (NE) aims to preserve pairwise similarities between data items and has been shown to yield an effective principle for data visualization. However, even the best existing NE methods, such as Stochastic Neighbor Embedding (SNE), may leave large-scale patterns such as clusters hidden, despite strong signals being present in the data. To address this, we propose a new cluster visualization method based on the Neighbor Embedding principle. We first present a family of Neighbor Embedding methods that generalizes SNE by using the non-normalized Kullback-Leibler divergence with a scale parameter. In this family, much better cluster visualizations often appear with a parameter value different from the one corresponding to SNE. We also develop efficient software that employs asynchronous stochastic block coordinate descent to optimize the new family of objective functions. Our experimental results demonstrate that the method consistently and substantially improves the visualization of data clusters compared with state-of-the-art NE approaches.


P.-R. Wagner, S. Marelli, I. Papaioannou (2021)

Estimating the probability of rare failure events is an essential step in the reliability assessment of engineering systems. Computing this failure probability for complex non-linear systems is challenging, and has recently spurred the development of active-learning reliability methods. These methods approximate the limit-state function (LSF) using surrogate models trained with a sequentially enriched set of model evaluations. A recently proposed method called stochastic spectral embedding (SSE) aims to improve the local approximation accuracy of global, spectral surrogate modelling techniques by sequentially embedding local residual expansions in subdomains of the input space. In this work we apply SSE to the LSF, giving rise to a stochastic spectral embedding-based reliability (SSER) method. The resulting partition of the input space decomposes the failure probability into a set of easy-to-compute domain-wise failure probabilities. We propose a set of modifications that tailor the algorithm to efficiently solve rare event estimation problems. These modifications include specialized refinement domain selection, partitioning and enrichment strategies. We showcase the algorithm performance on four benchmark problems of various dimensionality and complexity in the LSF.

Machine learning, Computation, Methodology

Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers

Liwei Wu, Shuqing Li, Cho-Jui Hsieh (2019)

In deep neural nets, lower-level embedding layers account for a large portion of the total number of parameters. Tikhonov regularization, graph-based regularization, and hard parameter sharing are approaches that introduce explicit biases into training in the hope of reducing statistical complexity. Alternatively, we propose stochastically shared embeddings (SSE), a data-driven approach to regularizing embedding layers, which stochastically transitions between embeddings during stochastic gradient descent (SGD). Because SSE integrates seamlessly with existing SGD algorithms, it can be used with only minor modifications when training large-scale neural networks. We develop tw

Machine learning

A Framework for Joint Unsupervised Learning of Cluster-Aware Embedding for Heterogeneous Networks

Rayyan Ahmad Khan, Martin Kleinsteuber (2021)

Heterogeneous Information Network (HIN) embedding refers to the low-dimensional projections of the HIN nodes that preserve the HIN structure and semantics. HIN embedding has emerged as a promising research field for network analysis as it enables downstream tasks such as clustering and node classification. In this work, we propose a framework for joint learning of cluster embeddings as well as cluster-aware HIN embedding. We assume that connected nodes are highly likely to fall in the same cluster, and adopt a variational approach to preserve the information in the pairwise relations in a cluster-aware manner. In addition, we deploy contrastive modules to simultaneously utilize the information in multiple meta-paths, thereby alleviating the meta-path selection problem, a challenge faced by many well-known HIN embedding approaches. The HIN embedding thus learned not only improves the clustering performance but also preserves pairwise proximity as well as the high-order HIN structure. We show the effectiveness of our approach by comparing it with many competitive baselines on three real-world datasets on clustering and downstream node classification.

Machine learning

Customized Graph Embedding: Tailoring Embedding Vectors to different Applications

Bitan Hou, Yujing Wang, Ming Zeng (2019)

Graphs are a natural representation of data for a variety of real-world applications, such as knowledge graph mining, social network analysis and biological network comparison. For these applications, graph embedding is crucial as it provides vector representations of the graph. One limitation of existing graph embedding methods is that their embedding optimization procedures are disconnected from the target application. In this paper, we propose a novel approach, namely Customized Graph Embedding (CGE), to tackle this problem. The CGE algorithm learns customized vector representations of graph nodes by automatically differentiating the importance of distinct graph paths for a specific application. Extensive experiments were carried out on a diverse set of node classification datasets, which demonstrate the strong performance of CGE and provide deep insights into the model.

Machine learning, Social and Information Networks

From Node Embedding To Community Embedding: A Hyperbolic Approach

Thomas Gerald, Hadi Zaatiti, Hatem Hajri (2019)

Detecting communities on graphs has received significant interest in recent literature. The current state-of-the-art community embedding approach, called ComE, tackles this problem by coupling graph embedding with community detection. Considering the success of hyperbolic representations of graph-structured data in recent years, an ongoing challenge is to set up a hyperbolic approach for the community detection problem. The present paper meets this challenge by introducing a Riemannian equivalent of ComE. Our proposed approach combines hyperbolic embeddings with Riemannian K-means or Riemannian mixture models to perform community detection. We illustrate the usefulness of this framework through several experiments on real-world social networks and comparisons with ComE and recent hyperbolic-based classification approaches.

Machine learning