No Arabic abstract
In the current deep learning based recommendation system, the embedding method is generally employed to complete the conversion from the high-dimensional sparse feature vector to the low-dimensional dense feature vector. However, as the dimension of the input vector of the embedding layer is too large, the addition of the embedding layer significantly slows down the convergence speed of the entire neural network, which is not acceptable in real-world scenarios. In addition, as the interaction between users and items increases and the relationship between items becomes more complicated, the embedding method proposed for sequence data is no longer suitable for graphic data in the current real environment. Therefore, in this paper, we propose the Dual-modal Graph Embedding Method (DGEM) to solve these problems. DGEM includes two modes, static and dynamic. We first construct the item graph to extract the graph structure and use random walk of unequal probability to capture the high-order proximity between the items. Then we generate the graph embedding vector through the Skip-Gram model, and finally feed the downstream deep neural network for the recommendation task. The experimental results show that DGEM can mine the high-order proximity between items and enhance the expression ability of the recommendation model. Meanwhile it also improves the recommendation performance by utilizing the time dependent relationship between items.
In this paper, we describe an embedding-based entity recommendation framework for Wikipedia that organizes Wikipedia into a collection of graphs layered on top of each other, learns complementary entity representations from their topology and content, and combines them with a lightweight learning-to-rank approach to recommend related entities on Wikipedia. Through offline and online evaluations, we show that the resulting embeddings and recommendations perform well in terms of quality and user engagement. Balancing simplicity and quality, this framework provides default entity recommendations for English and other languages in the Yahoo! Knowledge Graph, which Wikipedia is a core subset of.
Robust recommendation aims at capturing true preference of users from noisy data, for which there are two lines of methods have been proposed. One is based on noise injection, and the other is to adopt the generative model Variational Auto-encoder (VAE). However, the existing works still face two challenges. First, the noise injection based methods often draw the noise from a fixed noise distribution given in advance, while in real world, the noise distributions of different users and items may differ from each other due to personal behaviors and item usage patterns. Second, the VAE based models are not expressive enough to capture the true preference since VAE often yields an embedding space of a single modal, while in real world, user-item interactions usually exhibit multi-modality on user preference distribution. In this paper, we propose a novel model called Dual Adversarial Variational Embedding (DAVE) for robust recommendation, which can provide personalized noise reduction for different users and items, and capture the multi-modality of the embedding space, by combining the advantages of VAE and adversarial training between the introduced auxiliary discriminators and the variational inference networks. The extensive experiments conducted on real datasets verify the effectiveness of DAVE on robust recommendation.
Searching for papers from different academic databases is the most commonly used method by research beginners to obtain cross-domain technical solutions. However, it is usually inefficient and sometimes even useless because traditional search methods neither consider knowledge heterogeneity in different domains nor build the bottom layer of search, including but not limited to the characteristic description text of target solutions and solutions to be excluded. To alleviate this problem, a novel paper recommendation method is proposed herein by introducing master-slave domain knowledge graphs, which not only help users express their requirements more accurately but also helps the recommendation system better express knowledge. Specifically, it is not restricted by the cold start problem and is a challenge-oriented method. To identify the rationality and usefulness of the proposed method, we selected two cross-domains and three different academic databases for verification. The experimental results demonstrate the feasibility of obtaining new technical papers in the cross-domain scenario by research beginners using the proposed method. Further, a new research paradigm for research beginners in the early stages is proposed herein.
In recent years, recommender systems play a pivotal role in helping users identify the most suitable items that satisfy personal preferences. As user-item interactions can be naturally modelled as graph-structured data, variants of graph convolutional networks (GCNs) have become a well-established building block in the latest recommenders. Due to the wide utilization of sensitive user profile data, existing recommendation paradigms are likely to expose users to the threat of privacy breach, and GCN-based recommenders are no exception. Apart from the leakage of raw user data, the fragility of current recommenders under inference attacks offers malicious attackers a backdoor to estimate users private attributes via their behavioral footprints and the recommendation results. However, little attention has been paid to developing recommender systems that can defend such attribute inference attacks, and existing works achieve attack resistance by either sacrificing considerable recommendation accuracy or only covering specific attack models or protected information. In our paper, we propose GERAI, a novel differentially private graph convolutional network to address such limitations. Specifically, in GERAI, we bind the information perturbation mechanism in differential privacy with the recommendation capability of graph convolutional networks. Furthermore, based on local differential privacy and functional mechanism, we innovatively devise a dual-stage encryption paradigm to simultaneously enforce privacy guarantee on users sensitive features and the model optimization process. Extensive experiments show the superiority of GERAI in terms of its resistance to attribute inference attacks and recommendation effectiveness.
Production recommendation systems rely on embedding methods to represent various features. An impeding challenge in practice is that the large embedding matrix incurs substantial memory footprint in serving as the number of features grows over time. We propose a similarity-aware embedding matrix compression method called Saec to address this challenge. Saec clusters similar features within a field to reduce the embedding matrix size. Saec also adopts a fast clustering optimization based on feature frequency to drastically improve clustering time. We implement and evaluate Saec on Numerous, the production distributed machine learning system in Tencent, with 10-day worth of feature data from QQ mobile browser. Testbed experiments show that Saec reduces the number of embedding vectors by two orders of magnitude, compresses the embedding size by ~27x, and delivers the same AUC and log loss performance.