Organisations disclose their privacy practices by posting privacy policies on their websites. Even though users often care about their digital privacy, they rarely read privacy policies, since doing so requires a significant investment of time and effort. Although natural language processing can help with privacy policy understanding, there has been a lack of large-scale privacy policy corpora that could be used to analyse, understand, and simplify privacy policies. We therefore create PrivaSeer, a corpus of over one million English-language website privacy policies, which is significantly larger than any previously available corpus. We design a corpus creation pipeline that consists of crawling the web followed by filtering documents using language detection, document classification, duplicate and near-duplicate removal, and content extraction. We investigate the composition of the corpus, report results from readability tests, document similarity, and keyphrase extraction, and explore the corpus through topic modeling.
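As a rough illustration of the filtering stages named above, here is a minimal sketch, assuming documents have already been crawled and converted to plain text. The langdetect package, the 5-word shingles, and the 0.9 Jaccard threshold are illustrative choices, not details from the paper; a pipeline at corpus scale would replace the quadratic near-duplicate check with SimHash or MinHash.

```python
# Minimal sketch of the filtering stages: language detection, exact-duplicate
# removal, and near-duplicate removal. Names and thresholds are illustrative.
import hashlib

from langdetect import detect  # pip install langdetect


def shingles(text: str, k: int = 5) -> set:
    """Return the set of k-word shingles used for near-duplicate detection."""
    words = text.split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a or b else 0.0


def filter_corpus(candidate_texts):
    seen_hashes = set()  # content hashes of retained documents
    kept = []            # (text, shingle set) of retained documents
    for text in candidate_texts:
        # 1. Language detection: keep English documents only.
        try:
            if detect(text) != "en":
                continue
        except Exception:  # langdetect fails on empty/degenerate input
            continue
        # 2. Exact-duplicate removal via content hashing.
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        # 3. Near-duplicate removal via shingle overlap (quadratic here;
        #    use SimHash/MinHash at the scale of a million documents).
        sh = shingles(text)
        if any(jaccard(sh, other) > 0.9 for _, other in kept):
            continue
        kept.append((text, sh))
    return [text for text, _ in kept]
```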
As reinforcement learning techniques are increasingly applied to real-world decision problems, attention has turned to how these algorithms use potentially sensitive information. We consider the task of training a policy that maximizes reward while m…
Deep Neural Networks (DNNs) have shown great potential in many kinds of real-world applications, such as fraud detection and distress prediction. Meanwhile, data isolation has become a serious problem, i.e., different parties cannot share dat…
Contextual bandit algorithms (CBAs) often rely on personal data to provide recommendations. Centralized CBA agents utilize potentially sensitive data from recent interactions to provide personalization to end-users. Keeping the sensitive data locally…
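The abstract is cut off before the paper's own method, but the centralized setting it contrasts against can be sketched with a standard contextual bandit agent. Below is a minimal LinUCB agent with disjoint linear models; the class name, dimensions, and exploration parameter alpha are all illustrative, not taken from the paper.

```python
# Minimal sketch of a centralized contextual bandit agent (standard LinUCB
# with disjoint per-arm linear models). The `update` step is where raw,
# potentially sensitive interaction data accumulates at the agent.
import numpy as np


class LinUCB:
    def __init__(self, n_arms: int, dim: int, alpha: float = 1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # per-arm covariance
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # per-arm reward stats

    def select(self, x: np.ndarray) -> int:
        """Pick the arm maximizing mean estimate plus a confidence bonus."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm: int, x: np.ndarray, reward: float) -> None:
        # Centralized personalization: the agent stores statistics of every
        # (context, reward) interaction it observes.
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```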
In this paper, we propose FedGP, a framework for privacy-preserving data release in the federated learning setting. We use generative adversarial networks, whose generator components are trained with the FedAvg algorithm, to draw privacy-preserving arti…
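To make the FedAvg step concrete, here is a minimal sketch of averaging per-client generator parameters, weighted by local dataset size. Everything model-specific is elided: clients are assumed to expose their generator weights as lists of NumPy arrays, and names like `client_weights` and `client_sizes` are illustrative rather than from the paper.

```python
# Minimal sketch of one FedAvg aggregation round over generator parameters.
import numpy as np


def fedavg(client_weights, client_sizes):
    """Average per-client parameter lists, weighted by local data size."""
    total = sum(client_sizes)
    num_layers = len(client_weights[0])
    averaged = []
    for layer in range(num_layers):
        layer_avg = sum(
            (n / total) * w[layer]
            for w, n in zip(client_weights, client_sizes)
        )
        averaged.append(layer_avg)
    return averaged


# One communication round: clients train their generators locally on private
# data (not shown), the server averages the weights and broadcasts them back.
clients = [[np.random.randn(4, 4), np.random.randn(4)] for _ in range(3)]
global_generator = fedavg(clients, client_sizes=[100, 250, 50])
```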
Recent research in differential privacy has demonstrated that (sub)sampling can amplify the level of protection. For example, for $\epsilon$-differential privacy and simple random sampling with sampling rate $r$, the actual privacy guarantee is approximat…
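The sentence above is cut off, but the classical amplification-by-subsampling result it alludes to can be stated precisely. The following is the standard bound for running an $\epsilon$-differentially private mechanism on a random subsample drawn with rate $r$, not necessarily the paper's exact statement:
\[
  \varepsilon' \;=\; \ln\!\bigl(1 + r\,(e^{\varepsilon} - 1)\bigr)
  \;\approx\; r\,\varepsilon
  \qquad \text{for small } \varepsilon,
\]
so the subsampled mechanism satisfies $\varepsilon'$-differential privacy with $\varepsilon' \le \varepsilon$ for any $r \le 1$, and $\varepsilon' \approx r\varepsilon$ in the high-privacy regime.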