In recent years, Graph Neural Networks (GNNs) have emerged as state-of-the-art algorithms for analyzing non-Euclidean graph data. By applying deep learning to extract high-level representations from graph structures, GNNs achieve extraordinary accuracy and strong generalization ability across various tasks. However, with ever-increasing graph sizes, more complicated GNN layers, and higher feature dimensions, the computational complexity of GNNs grows rapidly. Performing GNN inference in real time has therefore become a challenging problem, especially on resource-limited edge-computing platforms. To tackle this challenge, we propose BlockGNN, a software-hardware co-design approach for efficient GNN acceleration. At the algorithm level, we propose to leverage block-circulant weight matrices to greatly reduce the complexity of various GNN models. At the hardware level, we propose a pipelined CirCore architecture, which supports efficient computation on block-circulant matrices. Building on CirCore, we present a novel BlockGNN accelerator that computes various GNNs with low latency. Moreover, to determine the optimal configurations for diverse deployed tasks, we also introduce a performance and resource model that helps choose the optimal hardware parameters automatically. Comprehensive experiments on the ZC706 FPGA platform demonstrate that, across various GNN tasks, BlockGNN achieves up to $8.3\times$ speedup compared to the baseline HyGCN architecture and $111.9\times$ energy reduction compared to the Intel Xeon CPU platform.
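To make the algorithmic idea concrete, the sketch below shows the standard FFT-based trick that makes block-circulant weight matrices cheap: each $b \times b$ circulant block is represented by a single length-$b$ vector, and its matrix-vector product reduces to elementwise multiplication in the frequency domain, cutting the per-block cost from $O(b^2)$ to $O(b \log b)$. This is a minimal NumPy illustration of the general technique, not the authors' CirCore pipeline; the function name `block_circulant_matvec` and the test sizes are hypothetical.

```python
import numpy as np

def block_circulant_matvec(blocks, x, b):
    """Multiply a block-circulant weight matrix by a vector via FFT.

    blocks: (p, q, b) array; blocks[i, j] is the first column of the
            b x b circulant block C_ij of the full matrix
            W = [[C_00 ... C_{0,q-1}], ..., [C_{p-1,0} ... C_{p-1,q-1}]].
    x:      input vector of length q * b.
    Returns W @ x (length p * b) in O(p*q*b log b) time instead of
    O(p*q*b^2), using the circular-convolution theorem per block.
    """
    p, q, _ = blocks.shape
    X = np.fft.fft(x.reshape(q, b), axis=1)   # FFT of each input segment
    B = np.fft.fft(blocks, axis=2)            # FFT of each block's defining vector
    # Elementwise product in the frequency domain, summed over the
    # q column-blocks, then one inverse FFT per row-block.
    Y = np.fft.ifft((B * X[None, :, :]).sum(axis=1), axis=1)
    return Y.real.reshape(p * b)

if __name__ == "__main__":
    # Sanity check against an explicit dense construction (toy sizes).
    rng = np.random.default_rng(0)
    p, q, b = 4, 3, 8
    blocks = rng.standard_normal((p, q, b))
    x = rng.standard_normal(q * b)

    def circulant(c):
        # Dense circulant matrix whose first column is c.
        return np.column_stack([np.roll(c, k) for k in range(len(c))])

    W = np.block([[circulant(blocks[i, j]) for j in range(q)]
                  for i in range(p)])
    assert np.allclose(W @ x, block_circulant_matvec(blocks, x, b))
```

The storage saving is the same factor as the compute saving: a $pb \times qb$ dense matrix needs $p \cdot q \cdot b^2$ weights, while its block-circulant counterpart needs only $p \cdot q \cdot b$, which is what makes the approach attractive on memory-constrained edge FPGAs.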