State-of-the-art password guessing tools, such as HashCat and John the Ripper, enable users to check billions of passwords per second against password hashes. In addition to performing straightforward dictionary attacks, these tools can expand password dictionaries using password generation rules, such as concatenation of words (e.g., password123456) and leet speak (e.g., password becomes p4s5w0rd). Although these rules work well in practice, expanding them to model further passwords is a laborious task that requires specialized expertise. To address this issue, we introduce PassGAN, a novel approach that replaces human-generated password rules with theory-grounded machine learning algorithms. Instead of relying on manual password analysis, PassGAN uses a Generative Adversarial Network (GAN) to autonomously learn the distribution of real passwords from actual password leaks and to generate high-quality password guesses. Our experiments show that this approach is very promising. When we evaluated PassGAN on two large password datasets, it surpassed rule-based and state-of-the-art machine learning password guessing tools. Unlike the other tools, however, PassGAN achieved this result without any a priori knowledge of passwords or common password structures. Additionally, when we combined the output of PassGAN with the output of HashCat, we were able to match 51%-73% more passwords than with HashCat alone. This is remarkable because it shows that PassGAN can autonomously extract a considerable number of password properties that current state-of-the-art rules do not encode.
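The rule-based dictionary expansion that this abstract contrasts PassGAN with can be illustrated with a minimal sketch. The substitution table `LEET` and the helpers `leet_variants` and `expand` are hypothetical names chosen for this example; they do not reproduce the rule syntax of HashCat or John the Ripper, whose rule sets are far larger.

```python
from itertools import product

# Hypothetical leet-speak table, assumed for illustration only.
LEET = {"a": "4", "e": "3", "o": "0", "s": "5"}


def leet_variants(word):
    """Yield every variant of `word` in which each mappable
    character is either kept or replaced by its leet form."""
    choices = [(ch, LEET[ch]) if ch in LEET else (ch,) for ch in word]
    for combo in product(*choices):
        yield "".join(combo)


def expand(words, suffixes):
    """Apply leet substitution, then concatenate each suffix.
    Duplicates are skipped so each candidate appears once."""
    seen = set()
    for word in words:
        for variant in leet_variants(word):
            for suffix in suffixes:
                candidate = variant + suffix
                if candidate not in seen:
                    seen.add(candidate)
                    yield candidate


# One dictionary word and two suffixes (the empty suffix keeps the bare variant).
guesses = list(expand(["password"], ["", "123456"]))
```

Even this toy rule set turns a single dictionary word into 32 candidates, including the abstract's own examples password123456 and p4s5w0rd. Every additional rule has to be written by hand, which is the manual effort the abstract says PassGAN aims to replace.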
In recent decades, criminals have increasingly used the web to research, assist and perpetrate criminal behaviour. One of the most important ways in which law enforcement can counter this growing trend is by accessing pertinent information about suspects in a timely manner. A significant hindrance is the difficulty of accessing any system used by a suspect that requires password authentication. Password guessing techniques generally take into account how users commonly behave when choosing their passwords, as well as the password policy in place. Such techniques can offer a modest success rate across a large or average population. However, they tend to fail when focusing on a single target, especially when that target is an educated user taking precautions, as a savvy criminal would be expected to do. Open Source Intelligence is increasingly leveraged by law enforcement to gain useful information about a suspect, but very little is currently being done to integrate this knowledge into password cracking in an automated way. The purpose of this research is to examine the techniques that enable the gathering of the necessary context about a suspect and to find ways to leverage this information within password guessing techniques.
Android, being the most widespread mobile operating system, is increasingly becoming a target for malware. Malicious apps designed to turn mobile devices into bots that may form part of a larger botnet have become quite common, thus posing a serious threat. This calls for more effective methods to detect botnets on the Android platform. Hence, in this paper, we present a deep learning approach for Android botnet detection based on Convolutional Neural Networks (CNN). Our proposed botnet detection system is implemented as a CNN-based model that is trained on 342 static app features to distinguish between botnet apps and normal apps. The trained botnet detection model was evaluated on a set of 6,802 real applications containing 1,929 botnet apps from the publicly available ISCX botnet dataset. The results show that our CNN-based approach had the highest overall prediction accuracy compared to other popular machine learning classifiers. Furthermore, the performance results observed from our model were better than those reported in previous studies on machine learning based Android botnet detection.
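As a rough illustration of how a convolutional layer processes a vector of static app features, the sketch below runs one 1D convolution, a ReLU, and max pooling in plain Python. The abstract does not specify the network architecture, so the use of 1D convolution, the kernel values, the pooling size, and the 8-element toy vector (standing in for the 342 features) are all assumptions made for this example.

```python
def conv1d(x, kernel):
    """Valid 1D cross-correlation of feature vector `x` with one kernel."""
    k = len(kernel)
    return [
        sum(x[i + j] * kernel[j] for j in range(k))
        for i in range(len(x) - k + 1)
    ]


def relu(values):
    """Replace negative activations with zero."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


# Toy binary feature vector (1 = feature present in the app); assumed data.
features = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical kernel; in a trained model these weights are learned.
kernel = [0.5, -1.0, 0.5]

activations = max_pool(relu(conv1d(features, kernel)), 2)
```

A real detector would stack many learned kernels and feed the pooled activations into dense layers that output a botnet or normal label; this sketch only shows the forward pass of a single filter.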
Vulnerabilities in password managers are unremitting because current designs provide large attack surfaces, both at the client and server. We describe and evaluate Horcrux, a password manager that is designed holistically to minimize and decentralize trust, while retaining the usability of a traditional password manager. The prototype Horcrux client, implemented as a Firefox add-on, is split into two components: code that has access to the user's master password and any key material is isolated in a small auditable component, separate from the complexity of managing the user interface. Instead of exposing actual credentials to the DOM, a dummy username and password are autofilled by the untrusted component. The trusted component intercepts and modifies POST requests before they are encrypted and sent over the network. To avoid trusting a centralized store, stored credentials are secret-shared over multiple servers. To provide domain and username privacy, while maintaining resilience to off-line attacks on a compromised password store, we incorporate cuckoo hashing in a way that ensures an attacker cannot determine if a guessed master password is correct. Our approach only works for websites that do not manipulate entered credentials in the browser client, so we conducted a large-scale experiment, which found that the technique appears to be compatible with over 98% of tested login forms.
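The idea of secret-sharing a credential over multiple servers can be sketched with a simple n-of-n XOR scheme. The abstract does not state which secret-sharing scheme Horcrux uses, so the XOR construction and the function names `split_secret` and `combine_shares` are assumptions made for illustration, not the system's actual design.

```python
import secrets


def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_secret(secret: bytes, n: int) -> list:
    """Split `secret` into n shares; all n are needed to rebuild it.
    Any n - 1 shares are uniformly random and reveal nothing."""
    if n < 2:
        raise ValueError("need at least two shares")
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last = secret
    for share in shares:
        last = _xor(last, share)
    shares.append(last)
    return shares


def combine_shares(shares: list) -> bytes:
    """XOR all shares together to recover the original secret."""
    result = bytes(len(shares[0]))
    for share in shares:
        result = _xor(result, share)
    return result


# Hypothetical credential, split so each of three servers stores one share.
credential = b"hunter2"
shares = split_secret(credential, 3)
recovered = combine_shares(shares)
```

Under this sketch, compromising one or two of the three servers yields only random-looking bytes. A threshold scheme such as Shamir's would additionally tolerate unavailable servers, at the cost of more involved arithmetic.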
In recent years, printable graphical codes have attracted a lot of attention because they provide a link between the physical and digital worlds, which is of great interest for IoT and brand protection applications. The security of printable codes in terms of their reproducibility by unauthorized parties, or clonability, is largely unexplored. In this paper, we investigate the clonability of printable graphical codes from a machine learning perspective. The proposed framework is based on a simple system composed of fully connected neural network layers. The results obtained on real codes printed by several printers demonstrate that, in certain cases, digital codes can be accurately estimated from their printed counterparts. This provides new insight into scenarios where printable graphical codes can be accurately cloned.
Deep Learning has recently become hugely popular in machine learning, providing significant improvements in classification accuracy in the presence of highly-structured and large databases. Researchers have also considered the privacy implications of deep learning. Models are typically trained in a centralized manner, with all the data being processed by the same training algorithm. If the data is a collection of users' private data, including habits, personal pictures, geographical positions, interests, and more, the centralized server will have access to sensitive information that could potentially be mishandled. To tackle this problem, collaborative deep learning models have recently been proposed in which parties locally train their deep learning structures and share only a subset of the parameters in an attempt to keep their respective training sets private. Parameters can also be obfuscated via differential privacy (DP) to make information extraction even more challenging, as proposed by Shokri and Shmatikov at CCS '15. Unfortunately, we show that any privacy-preserving collaborative deep learning approach is susceptible to a powerful attack that we devise in this paper. In particular, we show that a distributed, federated, or decentralized deep learning approach is fundamentally broken and does not protect the training sets of honest participants. The attack we developed exploits the real-time nature of the learning process, which allows the adversary to train a Generative Adversarial Network (GAN) that generates prototypical samples of the targeted training set that was meant to be private (the samples generated by the GAN are intended to come from the same distribution as the training data). Interestingly, we show that record-level DP applied to the shared parameters of the model, as suggested in previous work, is ineffective (i.e., record-level DP is not designed to address our attack).
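The collaborative setting described here, in which each party shares only a subset of its parameters and may add noise to them, can be sketched as follows. This is a simplified illustration, not the mechanism of Shokri and Shmatikov: selecting the largest-magnitude gradients, the clipping bound, the Laplace noise scale, and the function names are all assumptions, and the noise is not calibrated to any formal privacy budget.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two
    exponential variates with rate 1 / scale."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def select_and_obfuscate(gradients, fraction, clip, noise_scale, rng):
    """Pick the largest-magnitude gradients, clip each to [-clip, clip],
    and add Laplace noise. Returns {parameter index: noisy value}."""
    k = max(1, int(len(gradients) * fraction))
    top = sorted(
        range(len(gradients)),
        key=lambda i: abs(gradients[i]),
        reverse=True,
    )[:k]
    shared = {}
    for i in top:
        clipped = max(-clip, min(clip, gradients[i]))
        shared[i] = clipped + laplace_noise(noise_scale, rng)
    return shared


rng = random.Random(0)  # seeded only to make the sketch repeatable
local_gradients = [0.02, -1.5, 0.3, 0.9, -0.05, 0.0]  # toy values
shared = select_and_obfuscate(
    local_gradients, fraction=0.5, clip=1.0, noise_scale=0.1, rng=rng
)
```

Only three of the six gradients leave the participant, each perturbed by noise. The abstract's point is that the attack works regardless: the adversary learns from the evolving shared model rather than from any single record, which is why record-level DP is not designed to stop it.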