ترغب بنشر مسار تعليمي؟ اضغط هنا

Verification of Non-Linear Specifications for Neural Networks

701   0   0.0 ( 0 )
 نشر من قبل Chongli Qin
 تاريخ النشر 2019
والبحث باللغة English




اسأل ChatGPT حول البحث

Prior work on neural network verification has focused on specifications that are linear functions of the output of the network, e.g., invariance of the classifier output under adversarial perturbations of the input. In this paper, we extend verification algorithms to be able to certify richer properties of neural networks. To do this we introduce the class of convex-relaxable specifications, which constitute nonlinear specifications that can be verified using a convex relaxation. We show that a number of important properties of interest can be modeled within this class, including conservation of energy in a learned dynamics model of a physical system; semantic consistency of a classifiers output labels under adversarial perturbations and bounding errors in a system that predicts the summation of handwritten digits. Our experimental evaluation shows that our method is able to effectively verify these specifications. Moreover, our evaluation exposes the failure modes in models which cannot be verified to satisfy these specifications. Thus, emphasizing the importance of training models not just to fit training data but also to be consistent with specifications.

قيم البحث

اقرأ أيضاً

Despite the functional success of deep neural networks (DNNs), their trustworthiness remains a crucial open challenge. To address this challenge, both testing and verification techniques have been proposed. But these existing techniques provide eithe r scalability to large networks or formal guarantees, not both. In this paper, we propose a scalable quantitative verification framework for deep neural networks, i.e., a test-driven approach that comes with formal guarantees that a desired probabilistic property is satisfied. Our technique performs enough tests until soundness of a formal probabilistic property can be proven. It can be used to certify properties of both deterministic and randomized DNNs. We implement our approach in a tool called PROVERO and apply it in the context of certifying adversarial robustness of DNNs. In this context, we first show a new attack-agnostic measure of robustness which offers an alternative to purely attack-based methodology of evaluating robustness being reported today. Second, PROVERO provides certificates of robustness for large DNNs, where existing state-of-the-art verification tools fail to produce conclusive results. Our work paves the way forward for verifying properties of distributions captured by real-world deep neural networks, with provable guarantees, even where testers only have black-box access to the neural network.
Recurrent neural networks (RNNs) such as Long Short Term Memory (LSTM) networks have become popular in a variety of applications such as image processing, data classification, speech recognition, and as controllers in autonomous systems. In practical settings, there is often a need to deploy such RNNs on resource-constrained platforms such as mobile phones or embedded devices. As the memory footprint and energy consumption of such components become a bottleneck, there is interest in compressing and optimizing such networks using a range of heuristic techniques. However, these techniques do not guarantee the safety of the optimized network, e.g., against adversarial inputs, or equivalence of the optimized and original networks. To address this problem, we propose DIFFRNN, the first differential verification method for RNNs to certify the equivalence of two structurally similar neural networks. Existing work on differential verification for ReLUbased feed-forward neural networks does not apply to RNNs where nonlinear activation functions such as Sigmoid and Tanh cannot be avoided. RNNs also pose unique challenges such as handling sequential inputs, complex feedback structures, and interactions between the gates and states. In DIFFRNN, we overcome these challenges by bounding nonlinear activation functions with linear constraints and then solving constrained optimization problems to compute tight bounding boxes on nonlinear surfaces in a high-dimensional space. The soundness of these bounding boxes is then proved using the dReal SMT solver. We demonstrate the practical efficacy of our technique on a variety of benchmarks and show that DIFFRNN outperforms state-of-the-art RNN verification tools such as POPQORN.
Formal verification of neural networks is essential for their deployment in safety-critical areas. Many available formal verification methods have been shown to be instances of a unified Branch and Bound (BaB) formulation. We propose a novel framewor k for designing an effective branching strategy for BaB. Specifically, we learn a graph neural network (GNN) to imitate the strong branching heuristic behaviour. Our framework differs from previous methods for learning to branch in two main aspects. Firstly, our framework directly treats the neural network we want to verify as a graph input for the GNN. Secondly, we develop an intuitive forward and backward embedding update schedule. Empirically, our framework achieves roughly $50%$ reduction in both the number of branches and the time required for verification on various convolutional networks when compared to the best available hand-designed branching strategy. In addition, we show that our GNN model enjoys both horizontal and vertical transferability. Horizontally, the model trained on easy properties performs well on properties of increased difficulty levels. Vertically, the model trained on small neural networks achieves similar performance on large neural networks.
Recent works have partly attributed the generalization ability of over-parameterized neural networks to frequency bias -- networks trained with gradient descent on data drawn from a uniform distribution find a low frequency fit before high frequency ones. As realistic training sets are not drawn from a uniform distribution, we here use the Neural Tangent Kernel (NTK) model to explore the effect of variable density on training dynamics. Our results, which combine analytic and empirical observations, show that when learning a pure harmonic function of frequency $kappa$, convergence at a point $x in Sphere^{d-1}$ occurs in time $O(kappa^d/p(x))$ where $p(x)$ denotes the local density at $x$. Specifically, for data in $Sphere^1$ we analytically derive the eigenfunctions of the kernel associated with the NTK for two-layer networks. We further prove convergence results for deep, fully connected networks with respect to the spectral decomposition of the NTK. Our empirical study highlights similarities and differences between deep and shallow networks in this model.
106 - Beomseok Seo , Lin Lin , 2021
Deep neural network (DNN) models have achieved phenomenal success for applications in many domains, ranging from academic research in science and engineering to industry and business. The modeling power of DNN is believed to have come from the comple xity and over-parameterization of the model, which on the other hand has been criticized for the lack of interpretation. Although certainly not true for every application, in some applications, especially in economics, social science, healthcare industry, and administrative decision making, scientists or practitioners are resistant to use predictions made by a black-box system for multiple reasons. One reason is that a major purpose of a study can be to make discoveries based upon the prediction function, e.g., to reveal the relationships between measurements. Another reason can be that the training dataset is not large enough to make researchers feel completely sure about a purely data-driven result. Being able to examine and interpret the prediction function will enable researchers to connect the result with existing knowledge or gain insights about new directions to explore. Although classic statistical models are much more explainable, their accuracy often falls considerably below DNN. In this paper, we propose an approach to fill the gap between relatively simple explainable models and DNN such that we can more flexibly tune the trade-off between interpretability and accuracy. Our main idea is a mixture of discriminative models that is trained with the guidance from a DNN. Although mixtures of discriminative models have been studied before, our way of generating the mixture is quite different.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا