Early Stage Malware Prediction Using Recurrent Neural Networks


Abstract in English

Static malware analysis is well-suited to endpoint anti-virus systems as it can be conducted quickly by examining the features of an executable piece of code and matching it to previously observed malicious code. However, static code analysis can be vulnerable to code obfuscation techniques. Behavioural data collected during file execution is more difficult to obfuscate, but takes a relatively long time to capture - typically up to 5 minutes, meaning the malicious payload has likely already been delivered by the time it is detected. In this paper we investigate the possibility of predicting whether or not an executable is malicious based on a short snapshot of behavioural data. We find that an ensemble of recurrent neural networks are able to predict whether an executable is malicious or benign within the first 5 seconds of execution with 94% accuracy. This is the first time general types of malicious file have been predicted to be malicious during execution rather than using a complete activity log file post-execution, and enables cyber security endpoint protection to be advanced to use behavioural data for blocking malicious payloads rather than detecting them post-execution and having to repair the damage.

Download