
Privacy-Preserving Distributed Parameter Estimation for Probability Distribution of Wind Power Forecast Error

Published by: Mengshuo Jia
Publication date: 2018
Research field: Informatics Engineering
Paper language: English





Building the conditional probability distribution of wind power forecast errors benefits both wind farms (WFs) and independent system operators (ISOs). Establishing the joint probability distribution of the wind power and the corresponding forecast data of spatially correlated WFs is the foundation for deriving the conditional probability distribution. Traditional parameter estimation methods for probability distributions require collecting the historical data of all WFs. However, in multi-regional interconnected grids, neither regional ISOs nor WFs can collect the raw data of WFs in other regions due to privacy or competition considerations. Therefore, based on the Gaussian mixture model, this paper first proposes a privacy-preserving distributed expectation-maximization algorithm to estimate the parameters of the joint probability distribution. The algorithm builds on two original methods: (1) a privacy-preserving distributed summation algorithm and (2) a privacy-preserving distributed inner product algorithm. We then derive each WF's conditional probability distribution of forecast error from the joint one. With the proposed algorithms, WFs need only local calculations and privacy-preserving communication with neighbors to complete the whole parameter estimation. The algorithms are verified using the wind integration data set published by NREL.
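
To make the core building block concrete, below is a minimal sketch of a privacy-preserving distributed summation based on pairwise random masking between neighboring WFs. The masking scheme, the communication graph, and the local statistics are illustrative assumptions, not necessarily the paper's exact construction.

```python
# Minimal sketch of a privacy-preserving distributed summation, assuming a
# pairwise-masking scheme (not necessarily the paper's exact construction).
# Each pair of neighboring wind farms agrees on a random mask; one adds it,
# the other subtracts it, so masks cancel in the network-wide sum while each
# farm's raw local statistic stays hidden from its neighbors.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical local sufficient statistics of 4 wind farms (e.g., E-step sums).
local_values = np.array([3.2, 1.7, 4.5, 2.1])

# Hypothetical communication graph: list of neighboring pairs (i, j).
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

masked = local_values.copy()
for i, j in edges:
    r = rng.normal(scale=10.0)   # shared random mask for this pair
    masked[i] += r               # farm i adds the mask
    masked[j] -= r               # farm j subtracts it, so it cancels in the sum

# Any aggregation of the masked values recovers the true sum exactly.
assert np.isclose(masked.sum(), local_values.sum())
print("true sum:", local_values.sum(), " masked sum:", masked.sum())
```

Because the masks cancel exactly, any consensus-style aggregation of the masked values recovers the network-wide sum without exposing individual contributions.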




Read also

Due to the uncertainty of distributed wind generations (DWGs), a better understanding of the probability distributions (PD) of their wind power forecast errors (WPFEs) can help market participants (MPs) who own DWGs perform better during trading. Under the premise of an accurate PD model, considering the correlation among DWGs and absorbing the new information carried by the latest data are two ways to maintain an accurate PD. Both ways require the historical and latest wind power and forecast data of all DWGs. Each MP, however, only has access to the data of its own DWGs and may refuse to share these data with MPs belonging to other stakeholders. Besides, because new data are generated endlessly, the burden of updating the PD increases sharply. Therefore, we use a distributed strategy to deal with the data collection problem and further apply an incremental learning strategy to reduce the updating burden. Finally, we propose a distributed incremental update scheme that lets each MP continually acquire the latest conditional PD of its DWGs' WPFE. Specifically, we first use the Gaussian-mixture-model-based (GMM-based) joint PD to characterize the correlation among DWGs. Then, we propose a distributed modified incremental GMM algorithm that enables MPs to update the parameters of the joint PD in a distributed and incremental manner. After that, we further propose a distributed derivation algorithm that lets MPs derive their conditional PD of WPFE from the joint one in a distributed way. Combining the two original algorithms, we achieve the complete distributed incremental update scheme, by which each MP can continually obtain the latest conditional PD of its DWGs' WPFE via neighborhood communication and local calculation with its own data. The effectiveness, correctness, and efficiency of the proposed scheme are verified using a dataset from NREL.
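
As a rough illustration of the incremental part, the sketch below performs one online EM update of a one-dimensional Gaussian mixture on a new data batch, blending the old parameters with new-batch estimates via a step size rho. The mixture size, the step-size rule, and the blending of parameters (rather than sufficient statistics) are simplifying assumptions; this is not the paper's distributed modified incremental GMM algorithm.

```python
# Minimal sketch of one incremental (online) EM update for a 1-D Gaussian
# mixture, assuming a simple stochastic-approximation step size `rho`.
# It illustrates absorbing a new data batch without revisiting all
# historical samples.
import numpy as np
from scipy.stats import norm

def incremental_em_step(x_new, weights, means, stds, rho=0.1):
    # E-step on the new batch only: responsibilities gamma[n, k].
    dens = np.stack([w * norm.pdf(x_new, m, s)
                     for w, m, s in zip(weights, means, stds)], axis=1)
    gamma = dens / dens.sum(axis=1, keepdims=True)

    # New-batch sufficient statistics.
    Nk = gamma.sum(axis=0)
    mk = (gamma * x_new[:, None]).sum(axis=0) / Nk
    vk = (gamma * (x_new[:, None] - mk) ** 2).sum(axis=0) / Nk

    # M-step: blend old parameters with the new-batch estimates.
    weights = (1 - rho) * weights + rho * Nk / len(x_new)
    means = (1 - rho) * means + rho * mk
    stds = np.sqrt((1 - rho) * stds ** 2 + rho * vk)
    return weights / weights.sum(), means, stds

rng = np.random.default_rng(1)
w, m, s = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for _ in range(50):                     # stream of new forecast-error batches
    batch = np.concatenate([rng.normal(-2, 0.5, 50), rng.normal(2, 0.5, 50)])
    w, m, s = incremental_em_step(batch, w, m, s)
print(w.round(2), m.round(2), s.round(2))
```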
We consider the critical problem of distributed learning over data while keeping it private from the computational servers. The state-of-the-art approaches to this problem rely on quantizing the data into a finite field, so that cryptographic approaches for secure multiparty computing can then be employed. These approaches, however, can result in substantial accuracy losses due to fixed-point representation of the data and computation overflows. To address these critical issues, we propose a novel algorithm to solve the problem when data is in the analog domain, e.g., the field of real/complex numbers. We characterize the privacy of the data from both information-theoretic and cryptographic perspectives, while establishing a connection between the two notions in the analog domain. More specifically, the well-known connection between the distinguishing security (DS) and the mutual information security (MIS) metrics is extended from the discrete domain to the continuous domain. This is then utilized to bound the amount of information about the data leaked to the servers in our protocol, in terms of the DS metric, using well-known results on the capacity of the single-input multiple-output (SIMO) channel with correlated noise. We show how the proposed framework can be adopted to perform computation tasks when data is represented using floating-point numbers. We then show that this leads to a fundamental trade-off between the privacy level of the data and the accuracy of the result. As an application, we also show how to train a machine learning model while keeping the data as well as the trained model private. Numerical results are then shown for experiments on the MNIST dataset. Furthermore, experimental advantages are shown in comparison with fixed-point implementations over finite fields.
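
The flavor of analog-domain sharing can be illustrated with additive secret sharing over the reals, where large-variance Gaussian masks hide a private value from individual servers while the shares still sum back to it. The field-free sharing scheme and the chosen noise scale are assumptions for illustration; the sketch does not implement the paper's DS/MIS analysis or its SIMO-channel bound.

```python
# Minimal illustration of additive secret sharing over the reals, assuming
# large-variance Gaussian masks (a simplification of the analog framework).
import numpy as np

rng = np.random.default_rng(2)
x = 0.7321                      # a private real-valued data point
n_servers = 4
sigma = 1e3                     # mask std; larger sigma -> less leakage per share

masks = rng.normal(0.0, sigma, n_servers - 1)
shares = np.append(masks, x - masks.sum())   # shares sum exactly to x

# Each server holds one share; individually it looks like (almost) pure noise,
# but adding all shares reconstructs the secret up to floating-point round-off,
# which is one face of the privacy/accuracy trade-off mentioned above.
print("reconstruction error:", abs(shares.sum() - x))
```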
Rulin Shao, Hongyu He, Hui Liu (2019)
Artificial neural networks have achieved unprecedented success in the medical domain. This success depends on the availability of massive and representative datasets. However, data collection is often prevented by privacy concerns, and people want to take control over their sensitive information during both training and use. To address this problem, we propose a privacy-preserving method for distributed systems, Stochastic Channel-Based Federated Learning (SCBF), which enables the participants to train a high-performance model cooperatively without sharing their inputs. Specifically, we design, implement and evaluate a channel-based update algorithm for the central server in a distributed system, which selects the channels corresponding to the most active features in a training loop and uploads them as learned information from local datasets. A pruning process based on the validation set is applied to the algorithm and serves as a model accelerator. In our experiments, the model achieves better performance and a higher saturating speed than the Federated Averaging method, which reveals all the parameters of local models to the server when updating. We also demonstrate that the saturating rate of performance can be improved by introducing the pruning process, and further improvement can be achieved by tuning the pruning rate. Our experiments show that 57% of the time is saved by the pruning process with only a reduction of 0.0047 in AUCROC and a reduction of 0.0068 in AUCPR.
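
A minimal sketch of the channel-selection idea is given below, assuming "activity" is measured by the magnitude of each channel's weight change in a round; the activity metric, the top-k rule, and all array shapes are illustrative stand-ins rather than the authors' SCBF implementation.

```python
# Minimal sketch of channel-based update selection: rank channels by how much
# their weights moved in a training round and upload only the top-k, so the
# server never sees the full local model.
import numpy as np

rng = np.random.default_rng(3)
n_channels, channel_dim = 16, 32
w_before = rng.normal(size=(n_channels, channel_dim))
w_after = w_before + rng.normal(scale=0.05, size=w_before.shape)
w_after[[2, 7, 11]] += 0.8      # a few channels changed a lot this round

# Rank channels by how much they moved; upload only the top-k to the server.
activity = np.linalg.norm(w_after - w_before, axis=1)
top_k = np.argsort(activity)[-4:]
upload = {int(c): w_after[c] for c in top_k}   # partial, privacy-reducing update
print("uploaded channels:", sorted(upload))
```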
Lixin Fan, Kam Woh Ng, Ce Ju (2020)
This paper investigates the capabilities of Privacy-Preserving Deep Learning (PPDL) mechanisms against various forms of privacy attacks. First, we propose to quantitatively measure the trade-off between model accuracy and the privacy losses incurred by reconstruction, tracing and membership attacks. Second, we formulate reconstruction attacks as solving a noisy system of linear equations, and prove that such attacks are guaranteed to be defeated if condition (2) of the paper is unfulfilled. Third, based on the theoretical analysis, a novel Secret Polarization Network (SPN) is proposed to thwart privacy attacks, which pose serious challenges to existing PPDL methods. Extensive experiments show that model accuracies are improved on average by 5-20% compared with baseline mechanisms, in regimes where data privacy is satisfactorily protected.
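
To see why casting reconstruction as a noisy linear system matters, the generic sketch below recovers a private vector from leaked linear observations by least squares; the observation model and noise level are assumptions, and the snippet does not reproduce the paper's condition (2) or the SPN defense.

```python
# Minimal illustration of a reconstruction attack as solving a noisy linear
# system: the attacker observes y = A x + e (e.g., leaked linear combinations
# of private data) and recovers x by least squares.
import numpy as np

rng = np.random.default_rng(4)
d, m = 20, 60                   # private vector size, number of observations
x_private = rng.normal(size=d)
A = rng.normal(size=(m, d))     # mixing matrix known to the attacker
noise = rng.normal(scale=0.01, size=m)
y = A @ x_private + noise       # what the attacker gets to see

x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print("attack error:", np.linalg.norm(x_hat - x_private))
# With enough observations and small noise the attack succeeds; defenses aim
# to make the system ill-conditioned or the noise large enough that it fails.
```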
How to train a machine learning model while keeping the data private and secure? We present CodedPrivateML, a fast and scalable approach to this critical problem. CodedPrivateML keeps both the data and the model information-theoretically private, while allowing efficient parallelization of training across distributed workers. We characterize CodedPrivateML's privacy threshold and prove its convergence for logistic (and linear) regression. Furthermore, via extensive experiments on Amazon EC2, we demonstrate that CodedPrivateML provides significant speedup over cryptographic approaches based on multi-party computing (MPC).
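
The quantize-then-secret-share idea underlying MPC-style approaches can be sketched as follows, using fixed-point quantization into a prime field and simple additive sharing among three workers; the field size, scaling factor, and additive (rather than Lagrange-coded) sharing are illustrative simplifications, not CodedPrivateML itself.

```python
# Minimal sketch of quantize-then-secret-share: map real features into a
# prime field with fixed-point scaling, then split them into additive shares
# so no single worker learns the data.
import numpy as np

rng = np.random.default_rng(5)
P = 2_147_483_647               # assumed prime field size
SCALE = 2 ** 16                 # assumed fixed-point scaling factor

x = np.array([0.25, -1.5, 3.125])                        # private features
x_q = np.mod(np.round(x * SCALE).astype(np.int64), P)    # quantize into the field

# Additive secret sharing among 3 workers: shares sum to x_q (mod P).
s1, s2 = rng.integers(0, P, (2, x_q.size))
s3 = np.mod(x_q - s1 - s2, P)

recovered = np.mod(s1 + s2 + s3, P)
signed = np.where(recovered > P // 2, recovered - P, recovered)  # undo wrap-around
print("dequantized:", signed / SCALE)    # matches x up to fixed-point error
```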
