No Arabic abstract
The perpetual opposition between antiviruses and malware leads both parties to evolve continuously. On the one hand, antiviruses put in place solutions that are more and more sophisticated and propose more complex detection techniques in addition to the classic signature analysis. This sophistication leads antiviruses to leave more traces of their presence on the machine they protect. To remain undetected as long as possible, malware can avoid executing within such environments by hunting down the modifications left by the antiviruses. This paper aims at determining the possibilities for malware to detect the antiviruses and then evaluating the efficiency of these techniques on a panel of antiviruses that are the most used nowadays. We then collect samples showing this kind of behavior and propose to evaluate a countermeasure that creates false artifacts, thus forcing malware to evade.
Malware is a piece of software that was written with the intent of doing harm to data, devices, or people. Since a number of new malware variants can be generated by reusing codes, malware attacks can be easily launched and thus become common in recent years, incurring huge losses in businesses, governments, financial institutes, health providers, etc. To defeat these attacks, malware classification is employed, which plays an essential role in anti-virus products. However, existing works that employ either static analysis or dynamic analysis have major weaknesses in complicated reverse engineering and time-consuming tasks. In this paper, we propose a visualized malware classification framework called VisMal, which provides highly efficient categorization with acceptable accuracy. VisMal converts malware samples into images and then applies a contrast-limited adaptive histogram equalization algorithm to enhance the similarity between malware image regions in the same family. We provided a proof-of-concept implementation and carried out an extensive evaluation to verify the performance of our framework. The evaluation results indicate that VisMal can classify a malware sample within 5.2ms and have an average accuracy of 96.0%. Moreover, VisMal provides security engineers with a simple visualization approach to further validate its performance.
This work provides the community with a timely comprehensive review of backdoor attacks and countermeasures on deep learning. According to the attackers capability and affected stage of the machine learning pipeline, the attack surfaces are recognized to be wide and then formalized into six categorizations: code poisoning, outsourcing, pretrained, data collection, collaborative learning and post-deployment. Accordingly, attacks under each categorization are combed. The countermeasures are categorized into four general classes: blind backdoor removal, offline backdoor inspection, online backdoor inspection, and post backdoor removal. Accordingly, we review countermeasures, and compare and analyze their advantages and disadvantages. We have also reviewed the flip side of backdoor attacks, which are explored for i) protecting intellectual property of deep learning models, ii) acting as a honeypot to catch adversarial example attacks, and iii) verifying data deletion requested by the data contributor.Overall, the research on defense is far behind the attack, and there is no single defense that can prevent all types of backdoor attacks. In some cases, an attacker can intelligently bypass existing defenses with an adaptive attack. Drawing the insights from the systematic review, we also present key areas for future research on the backdoor, such as empirical security evaluations from physical trigger attacks, and in particular, more efficient and practical countermeasures are solicited.
Although state-of-the-art PDF malware classifiers can be trained with almost perfect test accuracy (99%) and extremely low false positive rate (under 0.1%), it has been shown that even a simple adversary can evade them. A practically useful malware classifier must be robust against evasion attacks. However, achieving such robustness is an extremely challenging task. In this paper, we take the first steps towards training robust PDF malware classifiers with verifiable robustness properties. For instance, a robustness property can enforce that no matter how many pages from benign documents are inserted into a PDF malware, the classifier must still classify it as malicious. We demonstrate how the worst-case behavior of a malware classifier with respect to specific robustness properties can be formally verified. Furthermore, we find that training classifiers that satisfy formally verified robustness properties can increase the evasion cost of unbounded (i.e., not bounded by the robustness properties) attackers by eliminating simple evasion attacks. Specifically, we propose a new distance metric that operates on the PDF tree structure and specify two classes of robustness properties including subtree insertions and deletions. We utilize state-of-the-art verifiably robust training method to build robust PDF malware classifiers. Our results show that, we can achieve 92.27% average verified robust accuracy over three properties, while maintaining 99.74% accuracy and 0.56% false positive rate. With simple robustness properties, our robust model maintains 7% higher robust accuracy than all the baseline models against unrestricted whitebox attacks. Moreover, the state-of-the-art and new adaptive evolutionary attackers need up to 10 times larger $L_0$ feature distance and 21 times more PDF basic mutations (e.g., inserting and deleting objects) to evade our robust model than the baselines.
Increasing numbers of mobile computing devices, user-portable, or embedded in vehicles, cargo containers, or the physical space, need to be aware of their location in order to provide a wide range of commercial services. Most often, mobile devices obtain their own location with the help of Global Navigation Satellite Systems (GNSS), integrating, for example, a Global Positioning System (GPS) receiver. Nonetheless, an adversary can compromise location-aware applications by attacking the GNSS-based positioning: It can forge navigation messages and mislead the receiver into calculating a fake location. In this paper, we analyze this vulnerability and propose and evaluate the effectiveness of countermeasures. First, we consider replay attacks, which can be effective even in the presence of future cryptographic GNSS protection mechanisms. Then, we propose and analyze methods that allow GNSS receivers to detect the reception of signals generated by an adversary, and then reject fake locations calculated because of the attack. We consider three diverse defense mechanisms, all based on knowledge, in particular, own location, time, and Doppler shift, receivers can obtain prior to the onset of an attack. We find that inertial mechanisms that estimate location can be defeated relatively easy. This is equally true for the mechanism that relies on clock readings from off-the-shelf devices; as a result, highly stable clocks could be needed. On the other hand, our Doppler Shift Test can be effective without any specialized hardware, and it can be applied to existing devices.
Malware analysis has been extensively investigated as the number and types of malware has increased dramatically. However, most previous studies use end-to-end systems to detect whether a sample is malicious, or to identify its malware family. In this paper, we propose a neural network framework composed of an embedder, an encoder, and a filter to learn malware representations from characteristic execution sequences for malware family classification. The embedder uses BERT and Sent2Vec, state-of-the-art embedding modules, to capture relations within a single API call and among consecutive API calls in an execution trace. The encoder comprises gated recurrent units (GRU) to preserve the ordinal position of API calls and a self-attention mechanism for comparing intra-relations among different positions of API calls. The filter identifies representative API calls to build the malware representation. We conduct broad experiments to determine the influence of individual framework components. The results show that the proposed framework outperforms the baselines, and also demonstrates that considering Sent2Vec to learn complete API call embeddings and GRU to explicitly preserve ordinal information yields more information and thus significant improvements. Also, the proposed approach effectively classifies new malicious execution traces on the basis of similarities with previously collected families.