Neural networks have succeeded in many reasoning tasks. Empirically, these tasks require specialized network structures, e.g., Graph Neural Networks (GNNs) perform well on many such tasks, while less structured networks fail. Theoretically, there is limited understanding of why and when one network structure generalizes better than another, even when both have equal expressive power. In this paper, we develop a framework to characterize which reasoning tasks a network can learn well, by studying how well its computation structure aligns with the algorithmic structure of the relevant reasoning process. We formally define this algorithmic alignment and derive a sample complexity bound that decreases with better alignment. This framework offers an explanation for the empirical success of popular reasoning models and suggests their limitations. As an example, we unify seemingly different reasoning tasks, such as intuitive physics, visual question answering, and shortest paths, via the lens of a powerful algorithmic paradigm, dynamic programming (DP). We show that GNNs align with DP and thus are expected to solve these tasks. Our theory is supported by empirical results on several reasoning tasks.
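A minimal sketch (not from the paper's code) of the alignment the abstract describes: one Bellman-Ford DP iteration for shortest paths has the same structure as a GNN message-passing step, where dist[u] + w plays the role of the per-edge message and min plays the role of the aggregation.

    import math

    def bellman_ford_step(dist, edges):
        """One DP iteration: relax every edge (u, v, w)."""
        new_dist = dict(dist)
        for u, v, w in edges:
            # "message" from u to v is dist[u] + w; "aggregation" is min,
            # mirroring h[v] = AGG({MSG(h[u], e_uv) for u in N(v)}) in a GNN
            new_dist[v] = min(new_dist[v], dist[u] + w)
        return new_dist

    # Toy graph: 0 -> 1 (weight 1), 1 -> 2 (weight 2), 0 -> 2 (weight 5)
    edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0)]
    dist = {0: 0.0, 1: math.inf, 2: math.inf}
    for _ in range(2):  # |V| - 1 iterations suffice here
        dist = bellman_ford_step(dist, edges)
    print(dist)  # {0: 0.0, 1: 1.0, 2: 3.0}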
Filters are the key components of modern convolutional neural networks (CNNs). However, since CNNs are usually over-parameterized, a pre-trained network always contains some invalid (unimportant) filters. These filters have relatively small $l_{1}$ norms.
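A minimal sketch (illustrative, not tied to any specific pruning paper's code) of the importance proxy this abstract refers to: ranking convolutional filters by their $l_{1}$ norm, so that small-norm filters surface as pruning candidates. The layer shape here is an arbitrary assumption.

    import numpy as np

    rng = np.random.default_rng(0)
    # Assume a conv layer with 8 output filters, each of shape (3, 3, 3):
    # (in_channels, kernel_h, kernel_w).
    weights = rng.normal(size=(8, 3, 3, 3))

    # l1 norm of each filter: sum of absolute weights over all but axis 0.
    l1_norms = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

    # Filters sorted from least to most "important" under this criterion.
    prune_order = np.argsort(l1_norms)
    print("l1 norms:", np.round(l1_norms, 2))
    print("prune candidates first:", prune_order)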
We study how neural networks trained by gradient descent extrapolate, i.e., what they learn outside the support of the training distribution. Previous works report mixed empirical results when extrapolating with neural networks: while feedforward neural networks, a.k.a. multilayer perceptrons (MLPs), do not extrapolate well in certain simple tasks, Graph Neural Networks (GNNs), structured networks built from MLP modules, have shown some success in more complex tasks.
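A minimal sketch (illustrative only, not the paper's experiments) of a property this line of work studies: far from the training support, a ReLU MLP's output becomes linear along any fixed direction, because once inputs are large enough no ReLU unit changes its on/off pattern. The tiny random network below is an assumption for demonstration.

    import numpy as np

    rng = np.random.default_rng(1)
    W1, b1 = rng.normal(size=(16, 1)), rng.normal(size=16)
    W2, b2 = rng.normal(size=(1, 16)), rng.normal(size=1)

    def mlp(x):
        h = np.maximum(W1 @ np.atleast_1d(x) + b1, 0.0)  # ReLU hidden layer
        return (W2 @ h + b2)[0]

    # Finite-difference slope at increasing distances from the origin:
    for x in [10.0, 100.0, 1000.0]:
        slope = mlp(x + 1.0) - mlp(x)
        print(f"x={x:>7}: slope ~ {slope:.6f}")
    # For large enough x the slopes coincide: the network has settled onto
    # a single linear piece, so it extrapolates linearly.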
To make advanced learning machines such as Deep Neural Networks (DNNs) more transparent in decision making, explainable AI (XAI) aims to provide interpretations of DNN predictions. These interpretations are usually given in the form of heatmaps, each assigning a relevance score to every input feature for a given prediction.
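A minimal sketch (illustrative, not any specific XAI method from this abstract) of one common way such heatmaps are produced: gradient-times-input relevance, where each feature's score is its input value weighted by the prediction's sensitivity to it. For a linear model this decomposes the prediction exactly.

    import numpy as np

    rng = np.random.default_rng(2)
    W = rng.normal(size=(1, 4))          # tiny linear "network": y = W x
    x = np.array([0.5, -1.0, 2.0, 0.0])  # one input with 4 features

    y = float(W @ x)             # scalar prediction
    grad = W[0]                  # dy/dx for this linear model is the weight row
    heatmap = np.abs(grad * x)   # gradient * input relevance per feature
    print("prediction:", round(y, 2))
    print("relevance :", np.round(heatmap, 2))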
Traditional deep neural networks (NNs) have significantly contributed to the state-of-the-art performance in the task of classification across various application domains. However, NNs have not considered the inherent uncertainty in data associated with the class probabilities, and misclassification under high uncertainty can introduce serious risk in real-world decision making.
While the advent of Graph Neural Networks (GNNs) has greatly improved node and graph representation learning in many applications, the neighborhood aggregation scheme exposes additional vulnerabilities to adversaries seeking to extract node-level information about the training graph.
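A minimal sketch (illustrative, not from any specific GNN library) of the neighborhood aggregation scheme this abstract refers to: each node's new representation mixes in its neighbors' features, which is why a model's outputs can leak information about neighboring nodes. The toy graph and mean aggregator are assumptions for demonstration.

    import numpy as np

    features = np.array([[1.0, 0.0],   # node 0
                         [0.0, 1.0],   # node 1
                         [1.0, 1.0]])  # node 2
    adj = {0: [1, 2], 1: [0], 2: [0]}  # undirected toy graph

    def aggregate(h, adj):
        # Mean aggregation: h_v' = mean of h_u over N(v) plus v itself
        out = np.zeros_like(h)
        for v, nbrs in adj.items():
            out[v] = h[[v] + nbrs].mean(axis=0)
        return out

    print(aggregate(features, adj))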