Into the Unknown: Active Monitoring of Neural Networks


Published by: Christian Schilling

Publication date: 2020

Research field: Informatics Engineering

Paper language: English

Authors: Anna Lukina - Christian Schilling - Thomas A. Henzinger

Machine Learning, Artificial Intelligence, Logic in Computer Science


Abstract

Neural-network classifiers achieve high accuracy when predicting the class of an input that they were trained to identify. Maintaining this accuracy in dynamic environments, where inputs frequently fall outside the fixed set of initially known classes, remains a challenge. The typical approach is to detect inputs from novel classes and retrain the classifier on an augmented dataset. However, not only the classifier but also the detection mechanism needs to adapt in order to distinguish between newly learned and yet unknown input classes. To address this challenge, we introduce an algorithmic framework for active monitoring of a neural network. A monitor wrapped in our framework operates in parallel with the neural network and interacts with a human user via a series of interpretable labeling queries for incremental adaptation. In addition, we propose an adaptive quantitative monitor to improve precision. An experimental evaluation on a diverse set of benchmarks with varying numbers of classes confirms the benefits of our active monitoring framework in dynamic scenarios.


Thomas A. Henzinger, Anna Lukina, Christian Schilling, 2019

Neural networks have demonstrated unmatched performance in a range of classification tasks. Despite numerous efforts of the research community, novelty detection remains one of the significant limitations of neural networks. The ability to identify previously unseen inputs as novel is crucial for our understanding of the decisions made by neural networks. At runtime, inputs not falling into any of the categories learned during training cannot be classified correctly by the neural network. Existing approaches treat the neural network as a black box and try to detect novel inputs based on the confidence of the output predictions. However, neural networks are not trained to reduce their confidence for novel inputs, which limits the effectiveness of these approaches. We propose a framework to monitor a neural network by observing the hidden layers. We employ a common abstraction from program analysis - boxes - to identify novel behaviors in the monitored layers, i.e., inputs that cause behaviors outside the box. For each neuron, the boxes range over the values seen in training. The framework is efficient and flexible to achieve a desired trade-off between raising false warnings and detecting novel inputs. We illustrate the performance and the robustness to variability in the unknown classes on popular image-classification benchmarks.
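The box abstraction described in this abstract can be illustrated with a minimal Python sketch. It assumes that hidden-layer activations are already available as plain lists of floats; grouping boxes by class label and the optional `tolerance` parameter are assumptions made for illustration, and the names `build_boxes` and `is_novel` are hypothetical rather than taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

# One (low, high) interval per monitored neuron.
Box = List[Tuple[float, float]]


def build_boxes(activations: Sequence[Sequence[float]],
                labels: Sequence[int]) -> Dict[int, Box]:
    """Build one box per class from activations observed during training.

    Assumption: one box per class label; the abstract only states that
    boxes range over the values seen in training for each neuron.
    """
    boxes: Dict[int, Box] = {}
    for vec, label in zip(activations, labels):
        if label not in boxes:
            # First sample of this class: the box is a single point.
            boxes[label] = [(v, v) for v in vec]
        else:
            # Widen each interval just enough to contain the new value.
            boxes[label] = [(min(lo, v), max(hi, v))
                            for (lo, hi), v in zip(boxes[label], vec)]
    return boxes


def is_novel(boxes: Dict[int, Box], vec: Sequence[float],
             predicted: int, tolerance: float = 0.0) -> bool:
    """Return True if the activation vector lies outside the predicted class's box."""
    box = boxes.get(predicted)
    if box is None:
        # No training data was seen for this class at all.
        return True
    return any(v < lo - tolerance or v > hi + tolerance
               for (lo, hi), v in zip(box, vec))
```

In this sketch, enlarging `tolerance` lowers the number of warnings at the cost of missing some novel inputs, which is one simple way to picture the trade-off the abstract mentions.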

Machine Learning, Artificial Intelligence, Logic in Computer Science

BDD4BNN: A BDD-based Quantitative Analysis Framework for Binarized Neural Networks

Yedi Zhang, Zhe Zhao, Guangke Chen, 2021

Verifying and explaining the behavior of neural networks is becoming increasingly important, especially when they are deployed in safety-critical applications. In this paper, we study verification problems for Binarized Neural Networks (BNNs), the 1-bit quantization of general real-numbered neural networks. Our approach is to encode BNNs into Binary Decision Diagrams (BDDs), which is done by exploiting the internal structure of the BNNs. In particular, we translate the input-output relation of blocks in BNNs to cardinality constraints which are then encoded by BDDs. Based on the encoding, we develop a quantitative verification framework for BNNs where precise and comprehensive analysis of BNNs can be performed. We demonstrate the application of our framework by providing quantitative robustness analysis and interpretability for BNNs. We implement a prototype tool BDD4BNN and carry out extensive experiments which confirm the effectiveness and efficiency of our approach.
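To see why a binarized neuron can be expressed as a cardinality constraint, consider the following Python sketch. It is a simplified illustration, not the BDD4BNN encoding: it assumes a single neuron with weights and inputs in {-1, +1}, an integer bias, and output +1 when the weighted sum plus bias is non-negative. All function names are hypothetical. Since the weighted sum equals 2 * (number of agreeing positions) - n, the sign test reduces to "at least k positions agree".

```python
from itertools import product
from typing import Iterable, Sequence


def neuron_sign(weights: Sequence[int], inputs: Sequence[int], bias: int) -> int:
    """Arithmetic form: sign of the weighted sum plus bias (+1 if >= 0)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else -1


def cardinality_threshold(n: int, bias: int) -> int:
    """Smallest agreement count k with 2*k - n + bias >= 0, i.e. ceil((n - bias) / 2)."""
    return -((bias - n) // 2)


def neuron_cardinality(weights: Sequence[int], inputs: Sequence[int], bias: int) -> int:
    """Cardinality form: +1 iff enough inputs agree with their weights."""
    agree = sum(1 for w, x in zip(weights, inputs) if w == x)
    return 1 if agree >= cardinality_threshold(len(weights), bias) else -1


def encodings_agree(n: int, biases: Iterable[int]) -> bool:
    """Exhaustively compare both forms for all +/-1 weights and inputs of size n."""
    biases = list(biases)
    for weights in product((-1, 1), repeat=n):
        for inputs in product((-1, 1), repeat=n):
            for bias in biases:
                if neuron_sign(weights, inputs, bias) != neuron_cardinality(weights, inputs, bias):
                    return False
    return True
```

Calling `encodings_agree(4, range(-5, 6))` checks the equivalence for every 4-input neuron in that bias range. A constraint of the form "at least k of n literals hold" is the kind of structure that a decision-diagram encoding can then represent.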

Machine Learning, Artificial Intelligence, Logic in Computer Science

Topological Insights into Sparse Neural Networks

Shiwei Liu, Tim Van der Lee, Anil Yaman, 2020

Sparse neural networks are effective approaches to reduce the resource requirements for the deployment of deep neural networks. Recently, the concept of adaptive sparse connectivity has emerged to allow training sparse neural networks from scratch by optimizing the sparse structure during training. However, comparing different sparse topologies and determining how sparse topologies evolve during training, especially when sparse structure optimization is involved, remain challenging open questions. This comparison becomes increasingly complex as the number of possible topological comparisons increases exponentially with the size of networks. In this work, we introduce an approach to understand and compare sparse neural network topologies from the perspective of graph theory. We first propose Neural Network Sparse Topology Distance (NNSTD) to measure the distance between different sparse neural networks. Further, we demonstrate that sparse neural networks can outperform over-parameterized models in terms of performance, even without any further structure optimization. Finally, we also show that adaptive sparse connectivity can always unveil a plenitude of sparse sub-networks with very different topologies which outperform the dense model, by quantifying and comparing their topological evolutionary processes. The latter findings complement the Lottery Ticket Hypothesis by showing that there is a much more efficient and robust way to find winning tickets. Altogether, our results start enabling a better theoretical understanding of sparse neural networks, and demonstrate the utility of using graph theory to analyze them.

Machine Learning

Some Insights into the Geometry and Training of Neural Networks

Ewout van den Berg, 2016

Neural networks have been successfully used for classification tasks in a rapidly growing number of practical applications. Despite their popularity and widespread use, there are still many aspects of training and classification that are not well understood. In this paper we aim to provide some new insights into training and classification by analyzing neural networks from a feature-space perspective. We review and explain the formation of decision regions and study some of their combinatorial aspects. We place a particular emphasis on the connections between the neural network weight and bias terms and properties of decision boundaries and other regions that exhibit varying levels of classification confidence. We show how the error backpropagates in these regions and emphasize the important role they have in the formation of gradients. These findings expose the connections between scaling of the weight parameters and the density of the training samples. This sheds more light on the vanishing gradient problem, explains the need for regularization, and suggests an approach for subsampling training data to improve performance.

Machine Learning

Increasing Trustworthiness of Deep Neural Networks via Accuracy Monitoring

Zhihui Shao, ..., Jianyi Yang, 2020

Inference accuracy of deep neural networks (DNNs) is a crucial performance metric, but can vary greatly in practice subject to actual test datasets and is typically unknown due to the lack of ground truth labels. This has raised significant concerns with trustworthiness of DNNs, especially in safety-critical applications. In this paper, we address trustworthiness of DNNs by using post-hoc processing to monitor the true inference accuracy on a user's dataset. Concretely, we propose a neural network-based accuracy monitor model, which only takes the deployed DNN's softmax probability output as its input and directly predicts if the DNN's prediction result is correct or not, thus leading to an estimate of the true inference accuracy. The accuracy monitor model can be pre-trained on a dataset relevant to the target application of interest, and only needs to actively label a small portion (1% in our experiments) of the user's dataset for model transfer. For estimation robustness, we further employ an ensemble of monitor models based on the Monte-Carlo dropout method. We evaluate our approach on different deployed DNN models for image classification and traffic sign detection over multiple datasets (including adversarial samples). The result shows that our accuracy monitor model provides a close-to-true accuracy estimation and outperforms the existing baseline methods.
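The step from per-sample correctness predictions to an accuracy estimate can be pictured with a short Python sketch. It does not implement the monitor model itself: it assumes the monitor's scores are already given, one list per stochastic (dropout) pass, and it assumes that passes are combined by averaging and thresholded at 0.5. The function name `estimate_accuracy` and both of those choices are hypothetical, made only for illustration.

```python
from statistics import mean
from typing import Sequence


def estimate_accuracy(pass_scores: Sequence[Sequence[float]],
                      threshold: float = 0.5) -> float:
    """Estimate inference accuracy without ground-truth labels.

    pass_scores[k][i] is the monitor's score, in stochastic pass k, that the
    deployed network's prediction on sample i is correct.
    """
    n_samples = len(pass_scores[0])
    # Assumption: combine the ensemble by averaging scores across passes.
    averaged = [mean(scores[i] for scores in pass_scores)
                for i in range(n_samples)]
    # A sample counts as "predicted correct" if its averaged score passes the threshold.
    judged_correct = [score >= threshold for score in averaged]
    # The estimated accuracy is the fraction of samples judged correct.
    return sum(judged_correct) / n_samples
```

With two passes over four samples, for example `[[0.9, 0.2, 0.6, 0.4], [0.7, 0.4, 0.5, 0.1]]`, two of the four averaged scores reach the threshold, so the estimate is 0.5.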

Machine Learning