No Arabic abstract
The number of Android malware variants (clones) are on the rise and, to stop this attack of clones we need to develop new methods and techniques for analysing and detecting them. As a first step, we need to study how these malware clones are generated. This will help us better anticipate and recognize these clones. In this paper we present a new tool named DroidMorph, that provides morphing of Android applications (APKs) at different level of abstractions, and can be used to create Android application (malware/benign) clones. As a case study we perform testing and evaluating resilience of current commercial anti-malware products against attack of the Android malware clones generated by DroidMorph. We found that 8 out of 17 leading commercial anti-malware programs were not able to detect any of the morphed APKs. We hope that DroidMorph will be used in future research, to improve Android malware clones analysis and detection, and help stop them.
Machine learning (ML) classifiers are vulnerable to adversarial examples. An adversarial example is an input sample which is slightly modified to induce misclassification in an ML classifier. In this work, we investigate white-box and grey-box evasion attacks to an ML-based malware detector and conduct performance evaluations in a real-world setting. We compare the defense approaches in mitigating the attacks. We propose a framework for deploying grey-box and black-box attacks to malware detection systems.
According to the Symantec and F-Secure threat reports, mobile malware development in 2013 and 2014 has continued to focus almost exclusively ~99% on the Android platform. Malware writers are applying stealthy mutations (obfuscations) to create malware variants, thwarting detection by signature based detectors. In addition, the plethora of more sophisticated detectors making use of static analysis techniques to detect such variants operate only at the bytecode level, meaning that malware embedded in native code goes undetected. A recent study shows that 86% of the most popular Android applications contain native code, making this a plausible threat. This paper proposes DroidNative, an Android malware detector that uses specific control flow patterns to reduce the effect of obfuscations, provides automation and platform independence, and as far as we know is the first system that operates at the Android native code level, allowing it to detect malware embedded in both native code and bytecode. When tested with traditional malware variants it achieves a detection rate (DR) of 99.48%, compared to academic and commercial tools DRs that range from 8.33% -- 93.22%. When tested with a dataset of 2240 samples DroidNative achieves a DR of 99.16%, a false positive rate of 0.4% and an average detection time of 26.87 sec/sample.
Android malware has been on the rise in recent years due to the increasing popularity of Android and the proliferation of third party application markets. Emerging Android malware families are increasingly adopting sophisticated detection avoidance techniques and this calls for more effective approaches for Android malware detection. Hence, in this paper we present and evaluate an n-gram opcode features based approach that utilizes machine learning to identify and categorize Android malware. This approach enables automated feature discovery without relying on prior expert or domain knowledge for pre-determined features. Furthermore, by using a data segmentation technique for feature selection, our analysis is able to scale up to 10-gram opcodes. Our experiments on a dataset of 2520 samples showed an f-measure of 98% using the n-gram opcode based approach. We also provide empirical findings that illustrate factors that have probable impact on the overall n-gram opcodes performance trends.
We present BPFroid -- a novel dynamic analysis framework for Android that uses the eBPF technology of the Linux kernel to continuously monitor events of user applications running on a real device. The monitored events are collected from different components of the Android software stack: internal kernel functions, system calls, native library functions, and the Java API framework. As BPFroid hooks these events in the kernel, a malware is unable to trivially bypass monitoring. Moreover, using eBPF doesnt require any change to the Android system or the monitored applications. We also present an analytical comparison of BPFroid to other malware detection methods and demonstrate its usage by developing novel signatures to detect suspicious behavior that are based on it. These signatures are then evaluated using real apps. We also demonstrate how BPFroid can be used to capture forensic artifacts for further investigation. Our results show that BPFroid successfully alerts in real time when a suspicious behavioral signature is detected, without incurring a significant runtime performance overhead.
With the growth of mobile devices and applications, the number of malicious software, or malware, is rapidly increasing in recent years, which calls for the development of advanced and effective malware detection approaches. Traditional methods such as signature-based ones cannot defend users from an increasing number of new types of malware or rapid malware behavior changes. In this paper, we propose a new Android malware detection approach based on deep learning and static analysis. Instead of using Application Programming Interfaces (APIs) only, we further analyze the source code of Android applications and create their higher-level graphical semantics, which makes it harder for attackers to evade detection. In particular, we use a call graph from method invocations in an Android application to represent the application, and further analyze method attributes to form a structured Program Representation Graph (PRG) with node attributes. Then, we use a graph convolutional network (GCN) to yield a graph representation of the application by embedding the entire graph into a dense vector, and classify whether it is a malware or not. To efficiently train such a graph convolutional network, we propose a batch training scheme that allows multiple heterogeneous graphs to be input as a batch. To the best of our knowledge, this is the first work to use graph representation learning for malware detection. We conduct extensive experiments from real-world sample collections and demonstrate that our developed system outperforms multiple other existing malware detection techniques.