No Arabic abstract
Widely-used Android static program analysis tools, e.g., Amandroid and FlowDroid, perform the whole-app inter-procedural analysis that is comprehensive but fundamentally difficult to handle modern (large) apps. The average app size has increased three to four times over five years. In this paper, we explore a new paradigm of targeted inter-procedural analysis that can skip irrelevant code and focus only on the flows of security-sensitive sink APIs. To this end, we propose a technique called on-the-fly bytecode search, which searches the disassembled app bytecode text just in time when a caller needs to be located. In this way, it guides targeted (and backward) inter-procedural analysis step by step until reaching entry points, without relying on a whole-app graph. Such search-based inter-procedural analysis, however, is challenging due to Java polymorphism, callbacks, asynchronous flows, static initializers, and inter-component communication in Android apps. We overcome these unique obstacles in our context by proposing a set of bytecode search mechanisms that utilize flexible searches and forward object taint analysis. Atop of this new inter-procedural analysis, we further adjust the traditional backward slicing and forward constant propagation to provide the complete dataflow tracking of sink API calls. We have implemented a prototype called BackDroid and compared it with Amandroid in analyzing 3,178 modern popular apps for crypto and SSL misconfigurations. The evaluation shows that for such sink-based problems, BackDroid is 37 times faster (2.13 v.s. 78.15 minutes) and has no timed-out failure (v.s. 35% in Amandroid), while maintaining close or even better detection effectiveness.
A fundamental premise of SMS One-Time Password (OTP) is that the used pseudo-random numbers (PRNs) are uniquely unpredictable for each login session. Hence, the process of generating PRNs is the most critical step in the OTP authentication. An improper implementation of the pseudo-random number generator (PRNG) will result in predictable or even static OTP values, making them vulnerable to potential attacks. In this paper, we present a vulnerability study against PRNGs implemented for Android apps. A key challenge is that PRNGs are typically implemented on the server-side, and thus the source code is not accessible. To resolve this issue, we build an analysis tool, sysname, to assess implementations of the PRNGs in an automated manner without the source code requirement. Through reverse engineering, sysname identifies the apps using SMS OTP and triggers each apps login functionality to retrieve OTP values. It further assesses the randomness of the OTP values to identify vulnerable PRNGs. By analyzing 6,431 commercially used Android apps downloaded from tool{Google Play} and tool{Tencent Myapp}, sysname identified 399 vulnerable apps that generate predictable OTP values. Even worse, 194 vulnerable apps use the OTP authentication alone without any additional security mechanisms, leading to insecure authentication against guessing attacks and replay attacks.
Third-party security apps are an integral part of the Android app ecosystem. Many users install them as an extra layer of protection for their devices. There are hundreds of such security apps, both free and paid in Google Play Store and some of them are downloaded millions of times. By installing security apps, the smartphone users place a significant amount of trust towards the security companies who developed these apps, because a fully functional mobile security app requires access to many smartphone resources such as the storage, text messages and email, browser history, and information about other installed applications. Often these resources contain highly sensitive personal information. As such, it is essential to understand the mobile security apps ecosystem to assess whether is it indeed beneficial to install them. To this end, in this paper, we present the first empirical study of Android security apps. We analyse 100 Android security apps from multiple aspects such as metadata, static analysis, and dynamic analysis and presents insights to their operations and behaviours. Our results show that 20% of the security apps we studied potentially resell the data they collect from smartphones to third parties; in some cases, even without the user consent. Also, our experiments show that around 50% of the security apps fail to identify malware installed on a smartphone.
Android is present in more than 85% of mobile devices, making it a prime target for malware. Malicious code is becoming increasingly sophisticated and relies on logic bombs to hide itself from dynamic analysis. In this paper, we perform a large scale study of TSOPEN, our open-source implementation of the state-of-the-art static logic bomb scanner TRIGGERSCOPE, on more than 500k Android applications. Results indicate that the approach scales. Moreover, we investigate the discrepancies and show that the approach can reach a very low false-positive rate, 0.3%, but at a particular cost, e.g., removing 90% of sensitive methods. Therefore, it might not be realistic to rely on such an approach to automatically detect all logic bombs in large datasets. However, it could be used to speed up the location of malicious code, for instance, while reverse engineering applications. We also present TRIGDB a database of 68 Android applications containing trigger-based behavior as a ground-truth to the research community.
The Android OS has become the most popular mobile operating system leading to a significant increase in the spread of Android malware. Consequently, several static and dynamic analysis systems have been developed to detect Android malware. With dynamic analysis, efficient test input generation is needed in order to trigger the potential run-time malicious behaviours. Most existing dynamic analysis systems employ random-based input generation methods usually built using the Android Monkey tool. Random-based input generation has several shortcomings including limited code coverage, which motivates us to explore combining it with a state-based method in order to improve efficiency. Hence, in this paper, we present a novel hybrid test input generation approach designed to improve dynamic analysis on real devices. We implemented the hybrid system by integrating a random based tool (Monkey) with a state based tool (DroidBot) in order to improve code coverage and potentially uncover more malicious behaviours. The system is evaluated using 2,444 Android apps containing 1222 benign and 1222 malware samples from the Android malware genome project. Three scenarios, random only, state-based only, and our proposed hybrid approach were investigated to comparatively evaluate their performances. Our study shows that the hybrid approach significantly improved the amount of dynamic features extracted from both benign and malware samples over the state-based and commonly used random test input generation method.
Mobile application security has been one of the major areas of security research in the last decade. Numerous application analysis tools have been proposed in response to malicious, curious, or vulnerable apps. However, existing tools, and specifically, static analysis tools, trade soundness of the analysis for precision and performance, and are hence soundy. Unfortunately, the specific unsound choices or flaws in the design of these tools are often not known or well-documented, leading to a misplaced confidence among researchers, developers, and users. This paper proposes the Mutation-based soundness evaluation ($mu$SE) framework, which systematically evaluates Android static analysis tools to discover, document, and fix, flaws, by leveraging the well-founded practice of mutation analysis. We implement $mu$SE as a semi-automated framework, and apply it to a set of prominent Android static analysis tools that detect private data leaks in apps. As the result of an in-depth analysis of one of the major tools, we discover 13 undocumented flaws. More importantly, we discover that all 13 flaws propagate to tools that inherit the flawed tool. We successfully fix one of the flaws in cooperation with the tool developers. Our results motivate the urgent need for systematic discovery and documentation of unsound choices in soundy tools, and demonstrate the opportunities in leveraging mutation testing in achieving this goal.