In this paper, we propose a game-theoretic adversarial intervention detection mechanism for reliable smart road signs. A future trend in intelligent transportation systems is ``smart'' road signs that incorporate smart codes (e.g., visible in the infrared) on their surface to provide more detailed information to smart vehicles. Such smart codes align the road sign classification problem more closely with communication settings than with conventional classification. This enables us to integrate well-established results from communication theory, e.g., error-correction methods, into the road sign classification problem. Recently, vision-based road sign classification algorithms have been shown to be vulnerable to even small-scale adversarial interventions that are imperceptible to humans. On the other hand, smart codes constructed via error-correction methods can provide robustness against small-scale intelligent or random perturbations. In the recognition of smart road signs, however, humans are out of the loop since they cannot see or interpret the codes. Therefore, there is no equivalent notion of imperceptible perturbations with which to achieve performance comparable to humans. Robustness against small-scale perturbations alone would not be sufficient, since an attacker freed from such a constraint can attack more aggressively. Under a game-theoretic solution concept, we seek to ensure a certain measure of guaranteed performance against even worst-case (intelligent) attackers that can perturb the signal at large scale. We provide a randomized detection strategy based on the distance between the decoder output and the received input, i.e., the error rate. Finally, we examine the performance of the proposed scheme over various scenarios.
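As a minimal illustration of such an error-rate-based detection rule, the sketch below pairs a toy repetition code with a randomized threshold: the intervention flag is raised when the Hamming distance between the received word and the re-encoded decoder output exceeds a threshold drawn at random. The code choice, thresholds, and mixing probabilities are all assumptions for illustration, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def repetition_decode(received, r=3):
    """Majority-vote decoding of an r-repetition code (an illustrative
    stand-in for the smart code's error-correction scheme)."""
    blocks = received.reshape(-1, r)
    return (blocks.sum(axis=1) > r // 2).astype(int)

def detect_intervention(received, r=3, thresholds=(1, 2), probs=(0.5, 0.5)):
    """Randomized detection: flag an attack when the error count (Hamming
    distance between the received word and the re-encoded decoder output)
    exceeds a threshold drawn at random, as in a mixed detection strategy."""
    decoded = repetition_decode(received, r)
    reencoded = np.repeat(decoded, r)
    errors = int(np.sum(received != reencoded))   # raw Hamming distance
    tau = rng.choice(thresholds, p=probs)         # randomized threshold
    return errors > tau, errors

# Example: a 4-bit message, repetition-encoded, then attacked on 3 positions.
msg = np.array([1, 0, 1, 1])
codeword = np.repeat(msg, 3)
attacked = codeword.copy()
attacked[[0, 4, 8]] ^= 1                          # adversarial bit flips
print(detect_intervention(attacked))              # (True, 3)
```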
Smart Meters (SMs) are a fundamental component of smart grids, but they carry sensitive information about users, such as the occupancy status of houses, and have therefore raised serious concerns about the leakage of consumers' private information. In particular, we focus on real-time privacy threats, i.e., potential attackers that try to infer sensitive data from SM reports in an online fashion. We adopt an information-theoretic privacy measure and show that it effectively limits the performance of any real-time attacker. Using this privacy measure, we propose a general formulation to design a privatization mechanism that can provide a target level of privacy by adding a minimal amount of distortion to the SM measurements. Moreover, to accommodate different applications, a flexible distortion measure is considered. This formulation leads to a general loss function, which is optimized using a deep-learning adversarial framework, where two neural networks, referred to as the releaser and the adversary, are trained with opposite goals. An exhaustive empirical study is then performed to validate the performance of the proposed approach for the occupancy detection privacy problem, assuming the attacker has either limited or full access to the training dataset.
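A minimal sketch of the releaser/adversary game might look as follows, assuming toy feed-forward networks, a mean-squared distortion term, and a weight `lambda_` trading distortion against the adversary's cross-entropy; the paper's actual architectures and loss function are not reproduced here.

```python
import torch
import torch.nn as nn

# Toy releaser (privatizes readings) and adversary (infers occupancy).
releaser = nn.Sequential(nn.Linear(24, 64), nn.ReLU(), nn.Linear(64, 24))
adversary = nn.Sequential(nn.Linear(24, 64), nn.ReLU(), nn.Linear(64, 1))
opt_r = torch.optim.Adam(releaser.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce, lambda_ = nn.BCEWithLogitsLoss(), 1.0

def train_step(x, y):          # x: SM readings, y: private occupancy label
    x_rel = releaser(x)        # privatized release

    # Adversary step: maximize inference accuracy on the released signal.
    loss_a = bce(adversary(x_rel.detach()), y)
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()

    # Releaser step: trade off distortion against the adversary's success.
    distortion = ((x_rel - x) ** 2).mean()
    loss_r = distortion - lambda_ * bce(adversary(x_rel), y)
    opt_r.zero_grad(); loss_r.backward(); opt_r.step()
    return loss_a.item(), loss_r.item()

x = torch.randn(32, 24)                      # a day of hourly readings
y = torch.randint(0, 2, (32, 1)).float()     # occupancy labels
print(train_step(x, y))
```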
Early risk diagnosis and driving anomaly detection from vehicle data streams are of great benefit to a range of advanced solutions for smart roads and crash prevention, although there are intrinsic challenges, especially the lack of ground truth and the need to define multiple risk exposures. This study proposes a domain-specific automatic clustering method (termed Autocluster) to self-learn the optimal models for unsupervised risk assessment; it integrates the key steps of risk clustering into an auto-optimisable pipeline, including feature and algorithm selection and hyperparameter auto-tuning. Firstly, based on surrogate conflict measures, indicator-guided feature extraction is conducted to construct temporal-spatial and kinematic risk features, and we develop an elimination-based model reliance importance (EMRI) method to select useful features in an unsupervised manner. Secondly, we propose the balanced Silhouette Index (bSI) to evaluate the internal quality of imbalanced clustering, and design a loss function that accounts for clustering performance in terms of internal quality, inter-cluster variation, and model stability. Thirdly, based on Bayesian optimisation, algorithm selection and hyperparameter auto-tuning are self-learned to generate the best clustering partitions; various algorithms are comprehensively investigated. NGSIM vehicle trajectory data are used for test-bedding. Findings show that Autocluster is reliable and promising for diagnosing multiple distinct risk exposures inherent to generalised driving behaviour. We also delve into aspects of risk clustering such as algorithm heterogeneity, Silhouette analysis, and hierarchical clustering flows. Meanwhile, Autocluster also serves as a method for unsupervised multi-risk data labelling and indicator threshold calibration. Furthermore, Autocluster is useful for tackling the challenges of imbalanced clustering without ground truth or a priori knowledge.
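The following sketch mimics the Autocluster search loop under strong simplifications: the plain silhouette score stands in for the proposed bSI, random search stands in for Bayesian optimisation, and the data, candidate algorithms, and hyperparameter ranges are placeholders rather than the paper's pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))                     # placeholder risk features

# Candidate clustering algorithms, parameterized by cluster count k.
candidates = [
    lambda k: KMeans(n_clusters=k, n_init=10, random_state=0),
    lambda k: AgglomerativeClustering(n_clusters=k),
]

best_score, best_config = -1.0, None
for _ in range(20):                               # random-search iterations
    make = candidates[rng.integers(len(candidates))]
    k = int(rng.integers(2, 8))
    labels = make(k).fit_predict(X)
    score = silhouette_score(X, labels)           # bSI would be used instead
    if score > best_score:
        best_score, best_config = score, (make, k)

print("best silhouette:", round(best_score, 3))
```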
How can we train a machine learning model while keeping the data private and secure? We present CodedPrivateML, a fast and scalable approach to this critical problem. CodedPrivateML keeps both the data and the model information-theoretically private, while allowing efficient parallelization of training across distributed workers. We characterize CodedPrivateML's privacy threshold and prove its convergence for logistic (and linear) regression. Furthermore, via extensive experiments on Amazon EC2, we demonstrate that CodedPrivateML provides significant speedup over cryptographic approaches based on multi-party computation (MPC).
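As a hedged illustration of information-theoretic privacy across distributed workers, the sketch below uses plain additive secret sharing over a prime field; CodedPrivateML itself relies on Lagrange coding of quantized data, which additionally enables coded computation on the shares, and is not reproduced here.

```python
import numpy as np

P = 2**13 - 1                                  # toy prime field size
rng = np.random.default_rng(0)

def share(X, n_workers):
    """Additive secret sharing over GF(P): any n-1 of the n workers
    jointly learn nothing about X (information-theoretic privacy)."""
    shares = [rng.integers(0, P, size=X.shape) for _ in range(n_workers - 1)]
    last = (X - sum(shares)) % P               # last share fixes the sum
    return shares + [last]

def reconstruct(shares):
    return sum(shares) % P

X = rng.integers(0, P, size=(4, 3))            # quantized dataset
shares = share(X, n_workers=3)
assert np.array_equal(reconstruct(shares), X)
print("reconstructed dataset matches the secret")
```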
Many recent works have shown that adversarial examples that fool classifiers can be found by minimally perturbing a normal input. Recent theoretical results, starting with Gilmer et al. (2018b), show that if the inputs are drawn from a concentrated metric probability space, then adversarial examples with small perturbation are inevitable. A concentrated space has the property that any subset with $\Omega(1)$ (e.g., 1/100) measure, according to the imposed distribution, has small distance to almost all (e.g., 99/100) of the points in the space. It is not clear, however, whether these theoretical results apply to actual distributions such as images. This paper presents a method for empirically measuring and bounding the concentration of a concrete dataset which is proven to converge to the actual concentration. We use it to empirically estimate the intrinsic robustness to $\ell_\infty$ and $\ell_2$ perturbations of several image classification benchmarks. Code for our experiments is available at https://github.com/xiaozhanguva/Measure-Concentration.
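A toy version of such an empirical estimate might look as follows, assuming a single $\ell_2$ ball as the error region: pick a dense center, grow the ball until it captures roughly an $\alpha$ fraction of the data, then measure the mass of its $\epsilon$-expansion. The paper's method searches over richer families of regions with convergence guarantees, which this sketch does not attempt.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))                # placeholder data points
alpha, eps = 0.01, 1.0

D = cdist(X, X)                                # pairwise L2 distances
k = max(1, int(alpha * len(X)))
center = D.sum(axis=1).argmin()                # a dense point as ball center
radius = np.sort(D[center])[k - 1]             # ball capturing ~alpha mass

# Empirical measure of the eps-expansion: points within radius + eps.
expansion = (D[center] <= radius + eps).mean()
print(f"measure(A) ~ {k/len(X):.3f}, measure(A_eps) ~ {expansion:.3f}")
```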
Outsourcing neural network inference tasks to an untrusted cloud raises data privacy and integrity concerns. To address these challenges, several privacy-preserving and verifiable inference techniques have been proposed based on replacing non-polynomial activation functions such as the rectified linear unit (ReLU) with polynomial activation functions. Such techniques usually require polynomials with integer coefficients or polynomials over finite fields. Motivated by such requirements, several works proposed replacing the ReLU activation function with the square activation function. In this work, we empirically show that the square function is not the best degree-$2$ polynomial that can replace the ReLU function, even when the polynomials are restricted to have integer coefficients. We instead propose a degree-$2$ polynomial activation function with a first-order term and empirically show that it can lead to much better models. Our experiments on the CIFAR-$10$ and CIFAR-$100$ datasets on various architectures show that our proposed activation function improves the test accuracy by up to $9.4\%$ compared to the square function.
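A drop-in replacement along these lines could be written as below; the learnable coefficients and their initial values are assumptions for illustration, since the paper works with specific (e.g., integer-constrained) coefficients that are not reproduced here.

```python
import torch
import torch.nn as nn

class PolyAct(nn.Module):
    """Degree-2 activation a*x^2 + b*x with a first-order term, as a
    drop-in ReLU replacement. Coefficients here are illustrative
    learnable parameters, not the paper's exact choice."""
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(0.25))
        self.b = nn.Parameter(torch.tensor(0.5))

    def forward(self, x):
        return self.a * x**2 + self.b * x

# Swap ReLU for PolyAct in a small CIFAR-style convolutional block.
block = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), PolyAct(),
                      nn.Conv2d(16, 16, 3, padding=1), PolyAct())
print(block(torch.randn(1, 3, 32, 32)).shape)   # torch.Size([1, 16, 32, 32])
```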