802.11 device fingerprinting is the process of characterizing a target device through its wireless traffic. This results in a signature that may be used for identification, network monitoring, or intrusion detection. The fingerprinting method can be active, by sending traffic to the target device, or passive, by only observing the traffic sent by the target device. Many passive fingerprinting methods rely on the observation of one particular network feature, such as the rate switching behavior or the transmission pattern of probe requests. In this work, we evaluate a set of global wireless network parameters with respect to their ability to identify 802.11 devices. We restrict ourselves to parameters that can be observed passively using a standard wireless card. We evaluate these parameters for two different tests: i) the identification test, which returns a single result that is the closest match for the target device, and ii) the similarity test, which returns a set of devices that are close to the target device. We find that the network parameters transmission time and frame inter-arrival time perform best among the network parameters considered. Finally, we focus on inter-arrival times, the most promising parameter for device identification, and show its dependence on several device characteristics such as the wireless card and driver, as well as the running applications.
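To make the two tests concrete, the following minimal Python sketch (an illustration only, not the authors' implementation) derives per-device inter-arrival-time signatures from already captured (timestamp, source MAC) pairs; the choice of features (mean and standard deviation of inter-arrival times) and of the Euclidean distance is an assumption made for this example.

```python
from collections import defaultdict
from statistics import mean, stdev

def build_signatures(frames):
    """Build one signature per source MAC from (timestamp, src_mac) pairs.

    The signature used here -- mean and standard deviation of frame
    inter-arrival times -- is an illustrative choice, not the full
    parameter set evaluated in the paper.
    """
    timestamps = defaultdict(list)
    for ts, mac in sorted(frames):
        timestamps[mac].append(ts)
    signatures = {}
    for mac, times in timestamps.items():
        gaps = [b - a for a, b in zip(times, times[1:])]
        if len(gaps) >= 2:
            signatures[mac] = (mean(gaps), stdev(gaps))
    return signatures

def distance(a, b):
    """Euclidean distance between two signatures (an assumed metric)."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def identification_test(target, known):
    """Identification test: return the single closest known device."""
    return min(known, key=lambda mac: distance(target, known[mac]))

def similarity_test(target, known, threshold):
    """Similarity test: return all known devices within a distance threshold."""
    return [mac for mac, sig in known.items() if distance(target, sig) <= threshold]
```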
While the Internet of Things (IoT) can benefit from machine learning by outsourcing model training to the cloud, exposing user data to an untrusted cloud service provider can pose a threat to user privacy. Recently, federated learning has been proposed as an approach to privacy-preserving machine learning (PPML) for the IoT, but its practicability remains unclear. This work evaluates the efficiency and privacy performance of a readily available federated learning framework based on PySyft, a Python library for distributed deep learning. It is observed that the training speed of the framework is significantly slower than that of the centralized approach due to communication overhead. Moreover, the framework is vulnerable to potential man-in-the-middle attacks at the network level. The report serves as a starting point for PPML performance analysis and suggests future directions for PPML framework development.
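The communication overhead mentioned above comes from repeatedly exchanging model parameters instead of raw data. The sketch below illustrates the underlying federated-averaging idea in plain NumPy; it is a simplified, assumed setup (a toy linear model and synthetic client data), not the PySyft-based framework evaluated in the report.

```python
import numpy as np

def local_step(w, X, y, lr=0.1):
    """One local gradient step of linear regression on a client's private data."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def federated_round(w_global, clients, lr=0.1):
    """One federated round: each client updates locally, the server averages.

    Only the model vectors cross the network; exchanging them every round is
    the source of the communication overhead compared to centralized training.
    """
    local_models = [local_step(w_global.copy(), X, y, lr) for X, y in clients]
    return np.mean(local_models, axis=0)

# Toy setup (assumed data): three clients holding disjoint private datasets.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    clients.append((X, y))

w = np.zeros(2)
for _ in range(200):
    w = federated_round(w, clients)
print(w)  # approaches true_w without any client sharing its raw data
```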
In the proof-of-stake (PoS) paradigm for maintaining decentralized, permissionless cryptocurrencies, Sybil attacks are prevented by basing the distribution of roles in the protocol execution on the stake distribution recorded in the ledger itself. However, for various reasons this distribution cannot be completely up to date, introducing a gap between the present stake distribution, which determines the parties' current incentives, and the one used by the protocol. In this paper, we investigate this issue and empirically quantify its effects. We survey existing provably secure PoS proposals and observe that the time gap between the two stake distributions, which we call the stake distribution lag, amounts to several days for each of these protocols. Based on this, we investigate the ledgers of four major cryptocurrencies (Bitcoin, Bitcoin Cash, Litecoin, and Zcash) and compute the average stake shift (the statistical distance between the two distributions) for each value of the stake distribution lag between 1 and 14 days, as well as related statistics. We also empirically quantify the sublinear growth of stake shift with the length of the considered lag interval. Finally, we turn our attention to unusual stake-shift spikes in these currencies: we observe that hard forks trigger major stake shifts and that single real-world actors, mostly exchanges, account for major stake shifts in established cryptocurrency ecosystems.
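For concreteness, the stake shift at a given lag is the statistical (total variation) distance between the current stake distribution and the lagged one used by the protocol. The short Python sketch below computes it for two hypothetical snapshots; the address-to-stake dictionaries are assumed inputs, standing in for distributions extracted from a real ledger.

```python
def stake_shift(dist_now, dist_lagged):
    """Statistical (total variation) distance between two stake distributions.

    Each input maps an address to its absolute stake; both are normalized
    to probability distributions before being compared.
    """
    total_now = sum(dist_now.values())
    total_lag = sum(dist_lagged.values())
    addresses = set(dist_now) | set(dist_lagged)
    return 0.5 * sum(
        abs(dist_now.get(a, 0) / total_now - dist_lagged.get(a, 0) / total_lag)
        for a in addresses
    )

# Hypothetical snapshots: today's stake vs. the lagged snapshot the protocol uses.
today = {"addr1": 40, "addr2": 35, "addr3": 25}
lagged = {"addr1": 50, "addr2": 30, "addr4": 20}
print(stake_shift(today, lagged))  # 0.0 means identical, 1.0 means disjoint
```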
Information flow analysis prevents secret or untrusted data from flowing into public or trusted sinks. Existing mechanisms cover a wide array of options, ranging from lightweight taint analysis to heavyweight information flow control that also considers implicit flows. Dynamic analysis, which is particularly popular for languages such as JavaScript, faces the question of whether to invest in analyzing flows caused by not executing a particular branch, so-called hidden implicit flows. This paper addresses the questions of how common different kinds of flows are in real-world programs, how important these flows are for enforcing security policies, and how costly it is to consider them. We address these questions in an empirical study that analyzes 56 real-world JavaScript programs suffering from various security problems, such as code injection vulnerabilities, denial-of-service vulnerabilities, memory leaks, and privacy leaks. The study is based on a state-of-the-art dynamic information flow analysis and a formalization of its core. We find that implicit flows are expensive to track in terms of permissiveness, label creep, and runtime overhead. We find a lightweight taint analysis to be sufficient for most of the studied security problems, while for some privacy-related code, observable tracking is sometimes required. In contrast, we do not find any evidence that tracking hidden implicit flows reveals otherwise missed security problems. Our results help security analysts and analysis designers understand the cost-benefit tradeoffs of information flow analysis and provide empirical evidence that analyzing implicit flows in a cost-effective way is a relevant problem.
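The distinction between explicit, observable implicit, and hidden implicit flows can be illustrated with a small example; the snippet below is written in Python for brevity (the study itself targets JavaScript) and follows the well-known laundering pattern in which a secret influences a public value through a branch that is never executed.

```python
def launder(secret):
    """Hidden implicit flow: the result always equals `secret`, yet in any
    single run one of the two branches never executes, so a dynamic monitor
    that only tracks executed assignments can miss the dependency."""
    x = False
    y = False
    if secret:       # observable implicit flow: x is updated under secret control
        x = True
    if not x:        # when this branch is skipped, y is *not* updated --
        y = True     # that non-update is the hidden implicit flow
    return not y     # equals secret in every run

secret_input = True
public_sink = secret_input    # explicit flow: caught by lightweight taint analysis
print(launder(secret_input))  # hidden implicit flow: still reveals the secret
```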
The prominence and use of the concept of cyber risk has been rising in recent years. This paper presents empirical investigations focused on two important and distinct groups within the broad community of cyber-defense professionals and researchers: (1) cyber practitioners and (2) developers of cyber ontologies. The key finding of this work is that the way the concept of cyber risk is treated by cybersecurity practitioners is largely inconsistent with the definitions of cyber risk commonly offered in the literature. Contrary to commonly cited definitions of cyber risk, concepts such as the likelihood of an event and the extent of its impact are not used by cybersecurity practitioners. The same holds for the use of these concepts in the current generation of cybersecurity ontologies. Instead, terms and concepts reflective of the adversarial nature of cyber defense appear to take the most prominent roles. This research offers the first quantitative empirical evidence that the rejection of traditional concepts of cyber risk by cybersecurity professionals is indeed observed in real-world practice.
In the current network-based computing world, where the number of interconnected devices grows exponentially, their diversity, malfunctions, and cybersecurity threats are increasing at the same rate. To guarantee the correct functioning and performance of novel environments such as Smart Cities, Industry 4.0, or crowdsensing, it is crucial to identify the capabilities of their devices (e.g., sensors, actuators) and to detect potential misbehavior that may arise due to cyberattacks, system faults, or misconfigurations. With this goal in mind, a promising research field has emerged that focuses on creating and managing fingerprints modeling the behavior of both the device's actions and its components. This article studies the recent growth of the device behavior fingerprinting field in terms of application scenarios, behavioral sources, and processing and evaluation techniques. First, it performs a comprehensive review of the device types, behavioral data, and processing and evaluation techniques used by the most recent and representative research works dealing with two major scenarios: device identification and device misbehavior detection. After that, each work is analyzed in depth and compared, emphasizing its characteristics, advantages, and limitations. The article also provides researchers with a review of the most relevant characteristics of existing datasets, as most of the novel processing techniques are based on machine learning and deep learning. Finally, it studies the evolution of these two scenarios in recent years, providing lessons learned, current trends, and future research challenges to guide new solutions in the area.