Ergodic Limits, Relaxations, and Geometric Properties of Random Walk Node Embeddings

Added by Christy Lin

Publication date 2021

fields Mathematical Statistics Informatics Engineering

Research language English

Authors Christy Lin - Daniel Sussman - Prakash Ishwar

Machine Learning Machine Learning

Random walk based node embedding algorithms learn vector representations of nodes by optimizing an objective function of node embedding vectors and skip-bigram statistics computed from random walks on the network. They have been applied to many supervised learning problems such as link prediction and node classification and have demonstrated state-of-the-art performance. Yet, their properties remain poorly understood. This paper studies properties of random walk based node embeddings in the unsupervised setting of discovering hidden block structure in the network, i.e., learning node representations whose cluster structure in Euclidean space reflects their adjacency structure within the network. We characterize the ergodic limits of the embedding objective, its generalization, and related convex relaxations to derive corresponding non-randomiz
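To make "skip-bigram statistics computed from random walks" concrete, here is a minimal sketch that counts (node, context) pairs along uniform random walks. The function name `skip_bigram_counts`, the toy two-triangle graph, and all parameter values are illustrative assumptions; this is not the authors' implementation, and the embedding objective itself is not shown.

```python
import random
from collections import Counter


def skip_bigram_counts(adj, num_walks, walk_length, window, seed=0):
    """Count ordered (node, later node) pairs at most `window` steps apart.

    `adj` maps each node to a list of neighbours (hypothetical input format).
    """
    rng = random.Random(seed)
    counts = Counter()
    for start in adj:
        for _ in range(num_walks):
            # Simulate one uniform random walk with `walk_length` nodes.
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(adj[walk[-1]]))
            # Pair every position with the next `window` positions.
            for i, u in enumerate(walk):
                last = min(i + window, walk_length - 1)
                for j in range(i + 1, last + 1):
                    counts[(u, walk[j])] += 1
    return counts


# Assumed toy graph: two triangles joined by the single edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
counts = skip_bigram_counts(adj, num_walks=50, walk_length=10, window=2)
print(counts.most_common(5))
```

Each walk of 10 nodes with a window of 2 contributes 17 pairs, so the counter holds 6 × 50 × 17 pairs in total. An embedding method would then fit node vectors to these counts.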

Delving Into Deep Walkers: A Convergence Analysis of Random-Walk-Based Vertex Embeddings

Dominik Kloepfer, Angelica I. Aviles-Rivero, Daniel Heydecker 2021

Graph vertex embeddings based on random walks have become increasingly influential in recent years, showing good performance in several tasks as they efficiently transform a graph into a more computationally digestible format while preserving relevant information. However, the theoretical properties of such algorithms, in particular the influence of hyperparameters and of the graph structure on their convergence behaviour, have so far not been well understood. In this work, we provide a theoretical analysis of random-walk-based embedding techniques. Firstly, we prove that, under some weak assumptions, vertex embeddings derived from random walks do indeed converge both in the single limit of the number of random walks $N \to \infty$ and in the double limit of both $N$ and the length of each random walk $L \to \infty$. Secondly, we derive concentration bounds quantifying the convergence rate of the corpora for the single and double limits. Thirdly, we use these results to derive a heuristic for choosing the hyperparameters $N$ and $L$. We validate and illustrate the practical importance of our findings with a range of numerical and visual experiments on several graphs drawn from real-world applications.
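As a rough illustration of the kind of limit involved (not the paper's analysis), the sketch below checks a standard fact: on a connected undirected graph, the fraction of time a long uniform random walk spends at each node approaches the degree-proportional stationary distribution. The function names, the toy graph, and the walk lengths are assumptions chosen for the example.

```python
import random


def visit_frequencies(adj, walk_length, seed=0):
    """Fraction of steps one uniform random walk spends at each node."""
    rng = random.Random(seed)
    visits = {v: 0 for v in adj}
    node = next(iter(adj))  # start at the first node in the dictionary
    for _ in range(walk_length):
        visits[node] += 1
        node = rng.choice(adj[node])
    return {v: c / walk_length for v, c in visits.items()}


def stationary_distribution(adj):
    """Degree-proportional stationary distribution of the uniform walk."""
    total_degree = sum(len(nbrs) for nbrs in adj.values())
    return {v: len(nbrs) / total_degree for v, nbrs in adj.items()}


# Assumed toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
pi = stationary_distribution(adj)

for length in (100, 10_000, 200_000):
    freq = visit_frequencies(adj, length)
    max_error = max(abs(freq[v] - pi[v]) for v in adj)
    print(length, round(max_error, 4))
```

The printed error generally shrinks as the walk gets longer, which is the informal picture behind choosing walk length and walk count large enough for the corpus statistics to stabilise.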

Machine Learning Machine Learning Probability

Consistency of random-walk based network embedding algorithms

Yichi Zhang, Minh Tang 2021

Random-walk based network embedding algorithms like node2vec and DeepWalk are widely used to obtain Euclidean representations of the nodes in a network prior to performing downstream network inference tasks. Nevertheless, despite their impressive empirical performance, there is a lack of theoretical results explaining their behavior. In this paper we study the node2vec and DeepWalk algorithms through the perspective of matrix factorization. We analyze these algorithms in the setting of community detection for stochastic blockmodel graphs; in particular, we establish large-sample error bounds and prove consistent community recovery of node2vec/DeepWalk embedding followed by k-means clustering. Our theoretical results indicate a subtle interplay between the sparsity of the observed networks, the window sizes of the random walks, and the convergence rates of the node2vec/DeepWalk embedding toward the embedding of the true but unknown edge probabilities matrix. More specifically, as the network becomes sparser, our results suggest using larger window sizes, or equivalently, taking longer random walks, in order to attain a better convergence rate for the resulting embeddings. The paper includes numerical experiments corroborating these observations.

Machine Learning Machine Learning Social and Information Networks

Simplest random walk for approximating Robin boundary value problems and ergodic limits of reflected diffusions

B. Leimkuhler, A. Sharma, M.V. Tretyakov 2020

A simple-to-implement weak-sense numerical method to approximate reflected stochastic differential equations (RSDEs) is proposed and analysed. It is proved that the method has first-order weak convergence. Together with the Monte Carlo technique, it can be used to numerically solve linear parabolic and elliptic PDEs with Robin boundary condition. One of the key results of this paper is the use of the proposed method for computing ergodic limits, i.e. expectations with respect to the invariant law of RSDEs, both inside a domain in $\mathbb{R}^{d}$ and on its boundary. This allows efficient sampling from distributions with compact support. Both time-averaging and ensemble-averaging estimators are considered and analysed. A number of extensions are considered, including a second-order weak approximation, the case of arbitrary oblique direction of reflection, and a new adaptive weak scheme to solve a Poisson PDE with Neumann boundary condition. The presented theoretical results are supported by several numerical experiments.

Numerical Analysis Numerical Analysis Probability

Stochastic Optimization of Sorting Networks via Continuous Relaxations

Aditya Grover, Eric Wang, Aaron Zweig 2019

Sorting input objects is an important step in many machine learning pipelines. However, the sorting operator is non-differentiable with respect to its inputs, which prohibits end-to-end gradient-based optimization. In this work, we propose NeuralSort, a general-purpose continuous relaxation of the output of the sorting operator from permutation matrices to the set of unimodal row-stochastic matrices, where every row sums to one and has a distinct arg max. This relaxation permits straight-through optimization of any computational graph that involves a sorting operation. Further, we use this relaxation to enable gradient-based stochastic optimization over the combinatorially large space of permutations by deriving a reparameterized gradient estimator for the Plackett-Luce family of distributions over permutations. We demonstrate the usefulness of our framework on three tasks that require learning semantic orderings of high-dimensional objects, including a fully differentiable, parameterized extension of the k-nearest neighbors algorithm.
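The abstract does not state the relaxation itself, so the following is a sketch that assumes the commonly cited form: row $i$ of the relaxed matrix is $\mathrm{softmax}\big(((n + 1 - 2i)\,s - A_s \mathbf{1})/\tau\big)$, where $A_s[j,k] = |s_j - s_k|$ and $\tau > 0$ is a temperature. The function name `neural_sort` and the example scores are illustrative; the code is written in plain Python and is not the authors' implementation.

```python
import math


def neural_sort(scores, tau=1.0):
    """Relaxed descending-sort matrix with unimodal, row-stochastic rows.

    Assumes the form softmax(((n + 1 - 2i) * s - A_s 1) / tau) for row i,
    with A_s[j][k] = |s_j - s_k|. Row i softly selects the i-th largest score.
    """
    n = len(scores)
    # b[k] is the k-th entry of A_s 1: the sum of |s_k - s_j| over all j.
    b = [sum(abs(sk - sj) for sj in scores) for sk in scores]
    rows = []
    for i in range(1, n + 1):
        logits = [((n + 1 - 2 * i) * s - bk) / tau
                  for s, bk in zip(scores, b)]
        # Numerically stable softmax over the row.
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


scores = [2.0, 5.0, 1.0]
P = neural_sort(scores, tau=0.1)
hard_order = [row.index(max(row)) for row in P]
print(hard_order)  # indices of the scores from largest to smallest
```

As the temperature shrinks, each row concentrates on a single column and the matrix approaches the permutation matrix of the descending sort; larger temperatures give smoother rows that are easier to differentiate through.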

Machine Learning Machine Learning Neural and Evolutionary Computing

Learning Structural Node Embeddings Via Diffusion Wavelets

Claire Donnat, Marinka Zitnik, David Hallac 2017

Nodes residing in different parts of a graph can have similar structural roles within their local network topology. The identification of such roles provides key insight into the organization of networks and can be used for a variety of machine learning tasks. However, learning structural representations of nodes is a challenging problem, and it has typically involved manually specifying and tailoring topological features for each node. In this paper, we develop GraphWave, a method that represents each node's network neighborhood via a low-dimensional embedding by leveraging heat wavelet diffusion patterns. Instead of training on hand-selected features, GraphWave learns these embeddings in an unsupervised way. We mathematically prove that nodes with similar network neighborhoods will have similar GraphWave embeddings even though these nodes may reside in very different parts of the network, and our method scales linearly with the number of edges. Experiments in a variety of different settings demonstrate GraphWave's real-world potential for capturing structural roles in networks, and our approach outperforms existing state-of-the-art baselines in every experiment, by as much as 137%.

Social and Information Networks Machine Learning Machine Learning
