In recent years, network embedding methods have garnered increasing attention because of their effectiveness in various information retrieval tasks. The goal is to learn low-dimensional representations of the vertices in an information network while capturing and preserving the network structure. Critical to the performance of a network embedding method is how the edges and vertices of the network are sampled for the learning process. Many existing methods adopt uniform sampling to reduce learning complexity, but when the network is non-uniform (i.e., weighted), uniform sampling incurs information loss. This paper presents a generalized vertex sampling framework that works seamlessly with most existing network embedding methods and supports weighted rather than uniform vertex/edge sampling. For efficiency, we propose a carefully designed sequential vertex-to-context graph data structure with which sampling a training pair for learning takes only constant time. For scalability and memory efficiency, we design the graph data structure to keep space consumption low, requiring no additional space. In addition to implementing existing network embedding methods, the proposed framework can be used to implement extensions that feature high-order proximity modeling and weighted relation modeling. Experiments conducted on three datasets, including a large-scale commercial one, verify the effectiveness and efficiency of the proposed weighted network embedding methods on a variety of tasks, including word similarity search, multi-label classification, and item recommendation.
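The abstract does not say how constant-time weighted sampling is achieved. One standard technique with that property is Walker's alias method, so the sketch below uses it as a stand-in; it is not the paper's vertex-to-context data structure. The function names `build_alias_table` and `sample_index` and the toy edge list are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def build_alias_table(weights: Sequence[float]) -> Tuple[List[float], List[int]]:
    """Build alias tables (O(n) setup) for non-negative weights."""
    n = len(weights)
    total = float(sum(weights))
    if n == 0 or total <= 0:
        raise ValueError("weights must be non-empty with a positive sum")
    # Rescale so the average weight is 1; each slot then holds mass 1.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = list(range(n))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]   # slot s keeps its own mass ...
        alias[s] = l          # ... and is topped up from item l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Remaining slots are full up to floating-point error.
    for i in small + large:
        prob[i] = 1.0
    return prob, alias


def sample_index(prob: List[float], alias: List[int], rng: random.Random) -> int:
    """Draw one index in O(1): pick a slot, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


# Toy weighted edge list (assumed data): (source, target, weight).
edges = [("a", "b", 1.0), ("a", "c", 3.0), ("b", "c", 6.0)]
prob, alias = build_alias_table([w for _, _, w in edges])
rng = random.Random(0)
counts = Counter(
    edges[sample_index(prob, alias, rng)][:2] for _ in range(100_000)
)
```

With these weights, the edge `("b", "c")` should be drawn about six times as often as `("a", "b")`, whereas uniform sampling would draw all three edges equally often and discard the weight information.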
User representation learning is vital for capturing diverse user preferences, but it is challenging because user intents are latent, scattered across complex and varied modalities of user-generated data, and thus not directly measurable. Inspired by
Sampling a network is an important prerequisite for unsupervised network embedding, and random walks have been widely used for sampling in previous studies. Since random-walk-based sampling tends to traverse adjacent neighbors, it may not be suita
Understanding the correlation between two different scores for the same set of items is a common problem in information retrieval, and the most commonly used statistic for quantifying this correlation is Kendall's $\tau$. However, the standard definit
Neural node embeddings have recently emerged as a powerful representation for supervised learning tasks involving graph-structured data. We leverage this recent advance to develop a novel algorithm for unsupervised community discovery in graphs. Thro
Attributed networks are ubiquitous, since a network often comes with auxiliary attribute information, e.g., a social network with user profiles. Attributed Network Embedding (ANE), which has recently attracted considerable attention, aims to learn unifie