ترغب بنشر مسار تعليمي؟ اضغط هنا

Can one hear the shape of a neural network?: Snooping the GPU via Magnetic Side Channel

117   0   0.0 ( 0 )
 نشر من قبل Henrique Maia
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Neural network applications have become popular in both enterprise and personal settings. Network solutions are tuned meticulously for each task, and designs that can robustly resolve queries end up in high demand. As the commercial value of accurate and performant machine learning models increases, so too does the demand to protect neural architectures as confidential investments. We explore the vulnerability of neural networks deployed as black boxes across accelerated hardware through electromagnetic side channels. We examine the magnetic flux emanating from a graphics processing units power cable, as acquired by a cheap $3 induction sensor, and find that this signal betrays the detailed topology and hyperparameters of a black-box neural network model. The attack acquires the magnetic signal for one query with unknown input values, but known input dimensions. The network reconstruction is possible due to the modular layer sequence in which deep neural networks are evaluated. We find that each layer components evaluation produces an identifiable magnetic signal signature, from which layer topology, width, function type, and sequence order can be inferred using a suitably trained classifier and a joint consistency optimization based on integer programming. We study the extent to which network specifications can be recovered, and consider metrics for comparing network similarity. We demonstrate the potential accuracy of this side channel attack in recovering the details for a broad range of network architectures, including random designs. We consider applications that may exploit this novel side channel exposure, such as adversarial transfer attacks. In response, we discuss countermeasures to protect against our method and other similar snooping techniques.



قيم البحث

اقرأ أيضاً

We show that the spectrum of the Schrodinger operator on a finite, metric graph determines uniquely the connectivity matrix and the bond lengths, provided that the lengths are non-commensurate and the connectivity is simple (no parallel bonds between vertices and no loops connecting a vertex to itself). That is, one can hear the shape of the graph! We also consider a related inversion problem: A compact graph can be converted into a scattering system by attaching to its vertices leads to infinity. We show that the scattering phase determines uniquely the compact part of the graph, under similar conditions as above.
Traditionally, network analysis is based on local properties of vertices, like their degree or clustering, and their statistical behavior across the network in question. This paper develops an approach which is different in two respects. We investiga te edge-based properties, and we define global characteristics of networks directly. The latter will provide our affirmative answer to the question raised in the title. More concretely, we start with Formans notion of the Ricci curvature of a graph, or more generally, a polyhedral complex. This will allow us to pass from a graph as representing a network to a polyhedral complex for instance by filling in triangles into connected triples of edges and to investigate the resulting effect on the curvature. This is insightful for two reasons: First, we can define a curvature flow in order to asymptotically simplify a network and reduce it to its essentials. Second, using a construction of Bloch, which yields a discrete Gauss-Bonnet theorem, we have the Euler characteristic of a network as a global characteristic. These two aspects beautifully merge in the sense that the asymptotic properties of the curvature flow are indicated by that Euler characteristic.
GPUs are increasingly being used in security applications, especially for accelerating encryption/decryption. While GPUs are an attractive platform in terms of performance, the security of these devices raises a number of concerns. One vulnerability is the data-dependent timing information, which can be exploited by adversary to recover the encryption key. Memory system features are frequently exploited since they create detectable timing variations. In this paper, our attack model is a coalescing attack, which leverages a critical GPU microarchitectural feature -- the coalescing unit. As multiple concurrent GPU memory requests can refer to the same cache block, the coalescing unit collapses them into a single memory transaction. The access time of an encryption kernel is dependent on the number of transactions. Correlation between a guessed key value and the associated timing samples can be exploited to recover the secret key. In this paper, a series of hardware/software countermeasures are proposed to obfuscate the memory timing side channel, making the GPU more resilient without impacting performance. Our hardware-based approach attempts to randomize the width of the coalescing unit to lower the signal-to-noise ratio. We present a hierarchical Miss Status Holding Register (MSHR) design that can merge transactions across different warps. This feature boosts performance, while, at the same time, secures the execution. We also present a software-based approach to permute the organization of critical data structures, significantly changing the coalescing behavior and introducing a high degree of randomness. Equipped with our new protections, the effort to launch a successful attack is increased up to 1433X . 178X, while also improving encryption/decryption performance up to 7%.
Deep learning is gaining importance in many applications. However, Neural Networks face several security and privacy threats. This is particularly significant in the scenario where Cloud infrastructures deploy a service with Neural Network model at t he back end. Here, an adversary can extract the Neural Network parameters, infer the regularization hyperparameter, identify if a data point was part of the training data, and generate effective transferable adversarial examples to evade classifiers. This paper shows how a Neural Network model is susceptible to timing side channel attack. In this paper, a black box Neural Network extraction attack is proposed by exploiting the timing side channels to infer the depth of the network. Although, constructing an equivalent architecture is a complex search problem, it is shown how Reinforcement Learning with knowledge distillation can effectively reduce the search space to infer a target model. The proposed approach has been tested with VGG architectures on CIFAR10 data set. It is observed that it is possible to reconstruct substitute models with test accuracy close to the target models and the proposed approach is scalable and independent of type of Neural Network architectures.
Recent work has introduced attacks that extract the architecture information of deep neural networks (DNN), as this knowledge enhances an adversarys capability to conduct black-box attacks against the model. This paper presents the first in-depth sec urity analysis of DNN fingerprinting attacks that exploit cache side-channels. First, we define the threat model for these attacks: our adversary does not need the ability to query the victim model; instead, she runs a co-located process on the host machine victims deep learning (DL) system is running and passively monitors the accesses of the target functions in the shared framework. Second, we introduce DeepRecon, an attack that reconstructs the architecture of the victim network by using the internal information extracted via Flush+Reload, a cache side-channel technique. Once the attacker observes function invocations that map directly to architecture attributes of the victim network, the attacker can reconstruct the victims entire network architecture. In our evaluation, we demonstrate that an attacker can accurately reconstruct two complex networks (VGG19 and ResNet50) having observed only one forward propagation. Based on the extracted architecture attributes, we also demonstrate that an attacker can build a meta-model that accurately fingerprints the architecture and family of the pre-trained model in a transfer learning setting. From this meta-model, we evaluate the importance of the observed attributes in the fingerprinting process. Third, we propose and evaluate new framework-level defense techniques that obfuscate our attackers observations. Our empirical security analysis represents a step toward understanding the DNNs vulnerability to cache side-channel attacks.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا