Theoretical Investigation of Composite Neural Network

65 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ming-Chuan Yang

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Ming-Chuan Yang - Meng Chang Chen

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This work theoretically investigates the performance of a composite neural network. A composite neural network is a rooted directed acyclic graph combining a set of pre-trained and non-instantiated neural network models, where a pre-trained neural network model is well-crafted for a specific task and targeted to approximate a specific function with instantiated weights. The advantages of adopting such a pre-trained model in a composite neural network are two folds. One is to benefit from others intelligence and diligence, and the other is saving the efforts in data preparation and resources and time in training. However, the overall performance of composite neural network is still not clear. In this work, we prove that a composite neural network, with high probability, performs better than any of its pre-trained components under certain assumptions. In addition, if an extra pre-trained component is added to a composite network, with high probability the overall performance will be improved. In the empirical evaluations, distinctively different applications support the above findings.

قيم البحث

69 - Ming-Chuan Yang , Meng Chang Chen 2019

This work investigates the framework and performance issues of the composite neural network, which is composed of a collection of pre-trained and non-instantiated neural network models connected as a rooted directed acyclic graph for solving complica ted applications. A pre-trained neural network model is generally well trained, targeted to approximate a specific function. Despite a general belief that a composite neural network may perform better than a single component, the overall performance characteristics are not clear. In this work, we construct the framework of a composite network, and prove that a composite neural network performs better than any of its pre-trained components with a high probability bound. In addition, if an extra pre-trained component is added to a composite network, with high probability, the overall performance will not be degraded. In the study, we explore a complicated application -- PM2.5 prediction -- to illustrate the correctness of the proposed composite network theory. In the empirical evaluations of PM2.5 prediction, the constructed composite neural network models support the proposed theory and perform better than other machine learning models, demonstrate the advantages of the proposed framework.

التعلم الآلي التعلم الالي

Neural Network Branching for Neural Network Verification

97 - Jingyue Lu , M. Pawan Kumar 2019

Formal verification of neural networks is essential for their deployment in safety-critical areas. Many available formal verification methods have been shown to be instances of a unified Branch and Bound (BaB) formulation. We propose a novel framewor k for designing an effective branching strategy for BaB. Specifically, we learn a graph neural network (GNN) to imitate the strong branching heuristic behaviour. Our framework differs from previous methods for learning to branch in two main aspects. Firstly, our framework directly treats the neural network we want to verify as a graph input for the GNN. Secondly, we develop an intuitive forward and backward embedding update schedule. Empirically, our framework achieves roughly $50%$ reduction in both the number of branches and the time required for verification on various convolutional networks when compared to the best available hand-designed branching strategy. In addition, we show that our GNN model enjoys both horizontal and vertical transferability. Horizontally, the model trained on easy properties performs well on properties of increased difficulty levels. Vertically, the model trained on small neural networks achieves similar performance on large neural networks.

التعلم الآلي التعلم الالي

Graph Wavelet Neural Network

163 - Bingbing Xu , Huawei Shen , Qi Cao 2019

We present graph wavelet neural network (GWNN), a novel graph convolutional neural network (CNN), leveraging graph wavelet transform to address the shortcomings of previous spectral graph CNN methods that depend on graph Fourier transform. Different from graph Fourier transform, graph wavelet transform can be obtained via a fast algorithm without requiring matrix eigendecomposition with high computational cost. Moreover, graph wavelets are sparse and localized in vertex domain, offering high efficiency and good interpretability for graph convolution. The proposed GWNN significantly outperforms previous spectral graph CNNs in the task of graph-based semi-supervised classification on three benchmark datasets: Cora, Citeseer and Pubmed.

التعلم الآلي التعلم الالي

Neural Network Memorization Dissection

97 - Jindong Gu , Volker Tresp 2019

Deep neural networks (DNNs) can easily fit a random labeling of the training data with zero training error. What is the difference between DNNs trained with random labels and the ones trained with true labels? Our paper answers this question with two contributions. First, we study the memorization properties of DNNs. Our empirical experiments shed light on how DNNs prioritize the learning of simple input patterns. In the second part, we propose to measure the similarity between what different DNNs have learned and memorized. With the proposed approach, we analyze and compare DNNs trained on data with true labels and random labels. The analysis shows that DNNs have textit{One way to Learn} and textit{N ways to Memorize}. We also use gradient information to gain an understanding of the analysis results.

التعلم الآلي التعلم الالي

Factor Graph Neural Network

91 - Zhen Zhang , Fan Wu , Wee Sun Lee 2019

Most of the successful deep neural network architectures are structured, often consisting of elements like convolutional neural networks and gated recurrent neural networks. Recently, graph neural networks have been successfully applied to graph stru ctured data such as point cloud and molecular data. These networks often only consider pairwise dependencies, as they operate on a graph structure. We generalize the graph neural network into a factor graph neural network (FGNN) in order to capture higher order dependencies. We show that FGNN is able to represent Max-Product Belief Propagation, an approximate inference algorithm on probabilistic graphical models; hence it is able to do well when Max-Product does well. Promising results on both synthetic and real datasets demonstrate the effectiveness of the proposed model.

التعلم الآلي التعلم الالي