ANDRUSPEX : Leveraging Graph Representation Learning to Predict Harmful App Installations on Mobile Devices


Abstract in English

Androids security model severely limits the capabilities of anti-malware software. Unlike commodity anti-malware solutions on desktop systems, their Android counterparts run as sandboxed applications without root privileges and are limited by Androids permission system. As such, PHAs on Android are usually willingly installed by victims, as they come disguised as useful applications with hidden malicious functionality, and are encountered on mobile app stores as suggestions based on the apps that a user previously installed. Users with similar interests and app installation history are likely to be exposed and to decide to install the same PHA. This observation gives us the opportunity to develop predictive approaches that can warn the user about which PHAs they will encounter and potentially be tempted to install in the near future. These approaches could then be used to complement commodity anti-malware solutions, which are focused on post-fact detection, closing the window of opportunity that existing solutions suffer from. In this paper we develop Andruspex, a system based on graph representation learning, allowing us to learn latent relationships between user devices and PHAs and leverage them for prediction. We test Andruspex on a real world dataset of PHA installations collected by a security company, and show that our approach achieves very high prediction results (up to 0.994 TPR at 0.0001 FPR), while at the same time outperforming alternative baseline methods. We also demonstrate that Andruspex is robust and its runtime performance is acceptable for a real world deployment.

Download