Cyber-Physical Systems (CPS) are characterized by their ability to integrate the physical and information (cyber) worlds. Their deployment in critical infrastructure has demonstrated a potential to transform the world. However, harnessing this potential is limited by their critical nature and the far-reaching effects of cyber attacks on humans, infrastructure, and the environment. Cyber concerns in CPS arise from sending information from sensors to actuators over wireless communication media, which widens the attack surface. Traditionally, CPS security has been investigated from the perspective of preventing intruders from gaining access to the system using cryptography and other access control techniques, and most research has accordingly focused on detecting attacks in CPS. However, in a world of increasing adversaries, it is becoming more difficult to fully shield CPS from attacks, hence the need to focus on making CPS resilient. Resilient CPS are designed to withstand disruptions and remain functional despite the operation of adversaries. One of the dominant methodologies for building resilient CPS relies on machine learning (ML) algorithms. However, drawing on recent research in adversarial ML, we posit that ML algorithms for securing CPS must themselves be resilient. This paper therefore comprehensively surveys the interactions between resilient CPS using ML and resilient ML as applied in CPS. The paper concludes with a number of research trends and promising future research directions, giving readers a thorough understanding of recent advances in ML-based security for CPS, in securing the ML itself, of the corresponding countermeasures, and of research trends in this active area.
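To make the attack-detection theme concrete, the following is a minimal, hypothetical sketch (not from the surveyed paper) of a residual-based detector that flags sensor readings deviating from a simple model prediction; the EWMA model, the threshold, and the injected bias are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Simulated sensor stream: slow drift plus measurement noise.
t = np.arange(200)
readings = 20.0 + 0.01 * t + rng.normal(0, 0.1, size=t.size)

# A false-data-injection attack biases readings 120..149.
readings[120:150] += 1.5

# Detector: compare each reading with an EWMA prediction of the signal.
alpha, threshold = 0.2, 0.8
pred, alarms = readings[0], []
for i in range(1, len(readings)):
    if abs(readings[i] - pred) > threshold:
        alarms.append(i)                      # flag, and skip the model update
    else:
        pred = alpha * readings[i] + (1 - alpha) * pred

print("alarms raised at steps:", alarms[:5], "...")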
Adversarial attacks on machine learning models have become a highly studied topic in both academia and industry. These attacks, along with traditional security threats, can compromise the confidentiality, integrity, and availability of organizations' assets that depend on machine learning models. While it is not easy to predict the types of new attacks that might be developed over time, it is possible to evaluate the risks connected to using machine learning models and to design measures that help minimize these risks. In this paper, we outline a novel framework to guide the risk management process for organizations reliant on machine learning models. First, we define sets of evaluation factors (EFs) in the data domain, the model domain, and the security controls domain. We then develop a method that takes asset and task importance, sets the weights of each EF's contribution to confidentiality, integrity, and availability, and, based on the implementation scores of the EFs, determines the overall security state of the organization. From this information, it is possible to identify weak links in the implemented security measures and to find out which measures might be missing entirely. We believe our framework can help address the security issues related to the use of machine learning models in organizations and guide them in focusing on adequate security measures to protect their assets.
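As a concrete illustration of the scoring method described above, here is a minimal sketch assuming per-EF implementation scores in [0, 1] and per-property weights; the EF names, the weights, and the simple weighted-average aggregation are our own placeholder assumptions, not the paper's actual factors or formula.

# Hypothetical sketch of the weighted EF scoring idea; all names and
# numbers are illustrative.
efs = {
    # EF name: (implementation score, {property: weight})
    "training-data access control": (0.8, {"C": 0.5, "I": 0.3, "A": 0.2}),
    "model input validation":       (0.4, {"C": 0.1, "I": 0.7, "A": 0.2}),
    "inference service redundancy": (0.9, {"C": 0.0, "I": 0.1, "A": 0.9}),
}
asset_importance = 0.7  # importance of the ML asset/task, in [0, 1]

def security_state(efs, prop):
    """Weighted average of EF scores for one property (C, I, or A)."""
    num = sum(score * w[prop] for score, w in efs.values())
    den = sum(w[prop] for _, w in efs.values())
    return num / den if den else 0.0

for prop in "CIA":
    print(prop, round(asset_importance * security_state(efs, prop), 3))

A low per-property result points at the EFs whose weights dominate that property, which is how weak or missing measures would surface in such a scheme.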
In recent years, data and computing resources have typically been distributed across end-user devices and across different regions or organizations. Because of laws and regulations, such distributed data and computing resources cannot be directly shared among regions or organizations for machine learning tasks. Federated learning (FL) has emerged as an efficient approach to exploiting distributed data and computing resources in order to collaboratively train machine learning models, while obeying laws and regulations and ensuring data security and privacy. In this paper, we provide a comprehensive survey of existing work on federated learning. We propose a functional architecture of federated learning systems and a taxonomy of related techniques. Furthermore, we discuss the distributed training, data communication, and security of FL systems. Finally, we analyze their limitations and propose future research directions.
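The core FL idea, local training plus parameter averaging, can be sketched in a few lines. The following is an illustrative FedAvg-style toy on synthetic linear-regression clients, not the protocol of any particular FL system: each client computes a local update on its private data, and only model parameters are shared, never the raw data.

import numpy as np

rng = np.random.default_rng(1)

def local_step(w, X, y, lr=0.1):
    """One gradient step of linear regression on a client's private data."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

# Three clients with private datasets drawn from the same underlying model.
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(0, 0.1, size=50)
    clients.append((X, y))

w_global = np.zeros(2)
for _ in range(100):
    # Each client trains locally; the server averages the returned weights.
    local_ws = [local_step(w_global, X, y) for X, y in clients]
    w_global = np.mean(local_ws, axis=0)

print("learned weights:", np.round(w_global, 2))  # approaches [2.0, -1.0]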
Machine learning techniques are currently used extensively to automate various cybersecurity tasks. Most of these techniques employ supervised learning algorithms, which are trained on data from the relevant domain to classify incoming data into different categories. A critical vulnerability of these algorithms is their susceptibility to adversarial attacks, in which a malicious entity, the adversary, deliberately alters the training data to mislead the learning algorithm into making classification errors. Adversarial attacks can render a learning algorithm unfit for use and leave critical systems vulnerable to cybersecurity attacks. Our paper provides a detailed survey of state-of-the-art techniques for making machine learning algorithms robust against adversarial attacks using the computational framework of game theory. We also discuss open problems, challenges, and possible directions for further research that would make deep learning-based systems more robust and reliable for cybersecurity tasks.
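To make the training-data tampering concrete, below is an illustrative label-flipping poisoning sketch using scikit-learn on synthetic data; the flip rate and the targeted class are arbitrary assumptions chosen to make the accuracy drop visible, not a specific attack from the surveyed literature.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clean = LogisticRegression().fit(X_tr, y_tr)
print("clean test accuracy:   ", clean.score(X_te, y_te))

# Adversary flips 60% of class-1 training labels to class 0, biasing
# the learned decision boundary toward the "benign" class.
ones = np.flatnonzero(y_tr == 1)
idx = np.random.default_rng(0).choice(ones, size=int(0.6 * len(ones)),
                                      replace=False)
y_poisoned = y_tr.copy()
y_poisoned[idx] = 0

poisoned = LogisticRegression().fit(X_tr, y_poisoned)
print("poisoned test accuracy:", poisoned.score(X_te, y_te))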
The Internet of Things (IoT) is becoming an indispensable part of everyday life, enabling a variety of emerging services and applications. However, the presence of rogue IoT devices has exposed the IoT to untold risks with severe consequences. The first step in securing the IoT is therefore detecting rogue devices and identifying legitimate ones. Conventional approaches use cryptographic mechanisms to authenticate and verify the identities of legitimate devices, but cryptographic protocols are not available in many systems, and these methods are less effective when legitimate devices can be exploited or when encryption keys are disclosed. Non-cryptographic IoT device identification and rogue device detection are therefore efficient solutions for securing existing systems, and they provide additional protection for systems that do use cryptographic protocols. Non-cryptographic approaches, however, require more effort and have not yet been adequately investigated. In this paper, we provide a comprehensive survey of machine learning technologies for identifying IoT devices and for detecting compromised or falsified ones, from the viewpoint of passive surveillance agents or network operators. We classify IoT device identification and detection into four categories: device-specific pattern recognition, deep learning-enabled device identification, unsupervised device identification, and abnormal device detection. We also discuss various ML-related enabling technologies for this purpose, including learning algorithms, feature engineering on network traffic traces and wireless signals, continual learning, and abnormality detection.
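A minimal sketch of the device-identification idea follows: train a classifier on flow-level traffic features. The device profiles, feature set, and data below are synthetic stand-ins; real pipelines would extract such features from captured traffic traces or wireless signals.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def traffic_features(device_type, n):
    """Synthetic [mean packet size, mean inter-arrival, packets/s] samples."""
    profiles = {  # hypothetical per-device-type traffic profiles
        "camera":     ([900, 0.03, 35.0], [50, 0.01, 5.0]),
        "thermostat": ([120, 2.00, 0.5],  [20, 0.50, 0.1]),
        "speaker":    ([400, 0.20, 6.0],  [60, 0.05, 1.0]),
    }
    mean, std = profiles[device_type]
    return rng.normal(mean, std, size=(n, 3))

X = np.vstack([traffic_features(d, 200)
               for d in ("camera", "thermostat", "speaker")])
y = np.repeat([0, 1, 2], 200)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print("device-type identification accuracy:", clf.score(X_te, y_te))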
Machine Learning (ML) has been widely applied to cybersecurity and is currently considered state of the art for solving many of the field's open issues. However, it is very difficult to evaluate how good the resulting solutions are, since the challenges faced in security may not appear in other areas (at least not in the same way). One of these challenges is concept drift, which creates an arms race between attackers and defenders: attackers keep creating novel, different threats over time to overcome defense solutions, and this evolution is not always considered in published work. Because of such issues, it is fundamental to know how to correctly build and evaluate an ML-based security solution. In this work, we list, detail, and discuss some of the challenges of applying ML to cybersecurity, including concept drift, concept evolution, delayed labels, and adversarial machine learning. We also show how existing solutions fail and, in some cases, propose possible fixes.
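The concept-drift pitfall can be demonstrated with a temporal-evaluation sketch: a detector trained on one period degrades on later, shifted data. The data, the amount of shift, and the classifier below are synthetic and illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def samples(n, shift):
    """Binary 'malicious vs benign' data; `shift` moves the malicious class."""
    X_benign = rng.normal(0.0, 1.0, size=(n, 5))
    X_malicious = rng.normal(2.0 - shift, 1.0, size=(n, 5))  # drifting class
    X = np.vstack([X_benign, X_malicious])
    y = np.r_[np.zeros(n), np.ones(n)]
    return X, y

X_old, y_old = samples(500, shift=0.0)   # training period
X_new, y_new = samples(500, shift=1.5)   # later period, after drift

clf = LogisticRegression().fit(X_old, y_old)
print("accuracy on same-period data:", clf.score(*samples(500, shift=0.0)))
print("accuracy after drift:        ", clf.score(X_new, y_new))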