أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Chaochao Chen

Cross-Domain Recommendation: Challenges, Progress, and Prospects

374 - Feng Zhu , Yan Wang , Chaochao Chen 2021

To address the long-standing data sparsity problem in recommender systems (RSs), cross-domain recommendation (CDR) has been proposed to leverage the relatively richer information from a richer domain to improve the recommendation performance in a spa rser domain. Although CDR has been extensively studied in recent years, there is a lack of a systematic review of the existing CDR approaches. To fill this gap, in this paper, we provide a comprehensive review of existing CDR approaches, including challenges, research progress, and future directions. Specifically, we first summarize existing CDR approaches into four types, including single-target CDR, multi-domain recommendation, dual-target CDR, and multi-target CDR. We then present the definitions and challenges of these CDR approaches. Next, we propose a full-view categorization and new taxonomies on these approaches and report their research progress in detail. In the end, we share several promising research directions in CDR.

استرجاع المعلومات الذكاء الاصطناعي التعلم الآلي

Towards Scalable and Privacy-Preserving Deep Neural Network via Algorithmic-Cryptographic Co-design

87 - Chaochao Chen , Jun Zhou , Longfei Zheng 2020

Deep Neural Networks (DNNs) have achieved remarkable progress in various real-world applications, especially when abundant training data are provided. However, data isolation has become a serious problem currently. Existing works build privacy preser ving DNN models from either algorithmic perspective or cryptographic perspective. The former mainly splits the DNN computation graph between data holders or between data holders and server, which demonstrates good scalability but suffers from accuracy loss and potential privacy risks. In contrast, the latter leverages time-consuming cryptographic techniques, which has strong privacy guarantee but poor scalability. In this paper, we propose SPNN - a Scalable and Privacy-preserving deep Neural Network learning framework, from algorithmic-cryptographic co-perspective. From algorithmic perspective, we split the computation graph of DNN models into two parts, i.e., the private data related computations that are performed by data holders and the rest heavy computations that are delegated to a server with high computation ability. From cryptographic perspective, we propose using two types of cryptographic techniques, i.e., secret sharing and homomorphic encryption, for the isolated data holders to conduct private data related computations privately and cooperatively. Furthermore, we implement SPNN in a decentralized setting and introduce user-friendly APIs. Experimental results conducted on real-world datasets demonstrate the superiority of SPNN.

التعلم الآلي التشفير والأمن

ASFGNN: Automated Separated-Federated Graph Neural Network

31 - Longfei Zheng , Jun Zhou , Chaochao Chen 2020

Graph Neural Networks (GNNs) have achieved remarkable performance by taking advantage of graph data. The success of GNN models always depends on rich features and adjacent relationships. However, in practice, such data are usually isolated by differe nt data owners (clients) and thus are likely to be Non-Independent and Identically Distributed (Non-IID). Meanwhile, considering the limited network status of data owners, hyper-parameters optimization for collaborative learning approaches is time-consuming in data isolation scenarios. To address these problems, we propose an Automated Separated-Federated Graph Neural Network (ASFGNN) learning paradigm. ASFGNN consists of two main components, i.e., the training of GNN and the tuning of hyper-parameters. Specifically, to solve the data Non-IID problem, we first propose a separated-federated GNN learning model, which decouples the training of GNN into two parts: the message passing part that is done by clients separately, and the loss computing part that is learnt by clients federally. To handle the time-consuming parameter tuning problem, we leverage Bayesian optimization technique to automatically tune the hyper-parameters of all the clients. We conduct experiments on benchmark datasets and the results demonstrate that ASFGNN significantly outperforms the naive federated GNN, in terms of both accuracy and parameter-tuning efficiency.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية

Vertically Federated Graph Neural Network for Privacy-Preserving Node Classification

74 - Jun Zhou , Chaochao Chen , Longfei Zheng 2020

Recently, Graph Neural Network (GNN) has achieved remarkable progresses in various real-world tasks on graph data, consisting of node features and the adjacent information between different nodes. High-performance GNN models always depend on both ric h features and complete edge information in graph. However, such information could possibly be isolated by different data holders in practice, which is the so-called data isolation problem. To solve this problem, in this paper, we propose VFGNN, a federated GNN learning paradigm for privacy-preserving node classification task under data vertically partitioned setting, which can be generalized to existing GNN models. Specifically, we split the computation graph into two parts. We leave the private data (i.e., features, edges, and labels) related computations on data holders, and delegate the rest of computations to a semi-honest server. We also propose to apply differential privacy to prevent potential information leakage from the server. We conduct experiments on three benchmarks and the results demonstrate the effectiveness of VFGNN.

التعلم الآلي التشفير والأمن التعلم الالي

Industrial Scale Privacy Preserving Deep Neural Network

104 - Longfei Zheng , Chaochao Chen , Yingting Liu 2020

Deep Neural Network (DNN) has been showing great potential in kinds of real-world applications such as fraud detection and distress prediction. Meanwhile, data isolation has become a serious problem currently, i.e., different parties cannot share dat a with each other. To solve this issue, most research leverages cryptographic techniques to train secure DNN models for multi-parties without compromising their private data. Although such methods have strong security guarantee, they are difficult to scale to deep networks and large datasets due to its high communication and computation complexities. To solve the scalability of the existing secure Deep Neural Network (DNN) in data isolation scenarios, in this paper, we propose an industrial scale privacy preserving neural network learning paradigm, which is secure against semi-honest adversaries. Our main idea is to split the computation graph of DNN into two parts, i.e., the computations related to private data are performed by each party using cryptographic techniques, and the rest computations are done by a neutral server with high computation ability. We also present a defender mechanism for further privacy protection. We conduct experiments on real-world fraud detection dataset and financial distress prediction dataset, the encouraging results demonstrate the practicalness of our proposal.

التعلم الآلي التشفير والأمن التعلم الالي

Practical Privacy Preserving POI Recommendation

418 - Chaochao Chen , Jun Zhou , Bingzhe Wu 2020

Point-of-Interest (POI) recommendation has been extensively studied and successfully applied in industry recently. However, most existing approaches build centralized models on the basis of collecting users data. Both private data and models are held by the recommender, which causes serious privacy concerns. In this paper, we propose a novel Privacy preserving POI Recommendation (PriRec) framework. First, to protect data privacy, users private data (features and actions) are kept on their own side, e.g., Cellphone or Pad. Meanwhile, the public data need to be accessed by all the users are kept by the recommender to reduce the storage costs of users devices. Those public data include: (1) static data only related to the status of POI, such as POI categories, and (2) dynamic data depend on user-POI actions such as visited counts. The dynamic data could be sensitive, and we develop local differential privacy techniques to release such data to public with privacy guarantees. Second, PriRec follows the representations of Factorization Machine (FM) that consists of linear model and the feature interaction model. To protect the model privacy, the linear models are saved on users side, and we propose a secure decentralized gradient descent protocol for users to learn it collaboratively. The feature interaction model is kept by the recommender since there is no privacy risk, and we adopt secure aggregation strategy in federated learning paradigm to learn it. To this end, PriRec keeps users private raw data and models in users own hands, and protects user privacy to a large extent. We apply PriRec in real-world datasets, and comprehensive experiments demonstrate that, compared with FM, PriRec achieves comparable or even better recommendation accuracy.

التشفير والأمن التعلم الآلي التعلم الالي

Privacy Preserving PCA for Multiparty Modeling

92 - Yingting Liu , Chaochao Chen , Longfei Zheng 2020

In this paper, we present a general multiparty modeling paradigm with Privacy Preserving Principal Component Analysis (PPPCA) for horizontally partitioned data. PPPCA can accomplish multiparty cooperative execution of PCA under the premise of keeping plaintext data locally. We also propose implementations using two techniques, i.e., homomorphic encryption and secret sharing. The output of PPPCA can be sent directly to data consumer to build any machine learning models. We conduct experiments on three UCI benchmark datasets and a real-world fraud detection dataset. Results show that the accuracy of the model built upon PPPCA is the same as the model with PCA that is built based on centralized plaintext data.

التشفير والأمن التعلم الآلي

Generalization in Generative Adversarial Networks: A Novel Perspective from Privacy Protection

357 - Bingzhe Wu , Shiwan Zhao , ChaoChao Chen 2019

In this paper, we aim to understand the generalization properties of generative adversarial networks (GANs) from a new perspective of privacy protection. Theoretically, we prove that a differentially private learning algorithm used for training the G AN does not overfit to a certain degree, i.e., the generalization gap can be bounded. Moreover, some recent works, such as the Bayesian GAN, can be re-interpreted based on our theoretical insight from privacy protection. Quantitatively, to evaluate the information leakage of well-trained GAN models, we perform various membership attacks on these models. The results show that previous Lipschitz regularization techniques are effective in not only reducing the generalization gap but also alleviating the information leakage of the training dataset.

التعلم الآلي التشفير والأمن التعلم الالي

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد