ﻻ يوجد ملخص باللغة العربية
We propose FMMformers, a class of efficient and flexible transformers inspired by the celebrated fast multipole method (FMM) for accelerating interacting particle simulation. FMM decomposes particle-particle interaction into near-field and far-field components and then performs direct and coarse-grained computation, respectively. Similarly, FMMformers decompose the attention into near-field and far-field attention, modeling the near-field attention by a banded matrix and the far-field attention by a low-rank matrix. Computing the attention matrix for FMMformers requires linear complexity in computational time and memory footprint with respect to the sequence length. In contrast, standard transformers suffer from quadratic complexity. We analyze and validate the advantage of FMMformers over the standard transformer on the Long Range Arena and language modeling benchmarks. FMMformers can even outperform the standard transformer in terms of accuracy by a significant margin. For instance, FMMformers achieve an average classification accuracy of $60.74%$ over the five Long Range Arena tasks, which is significantly better than the standard transformers average accuracy of $58.70%$.
We propose a novel learning framework using neural mean-field (NMF) dynamics for inference and estimation problems on heterogeneous diffusion networks. Our new framework leverages the Mori-Zwanzig formalism to obtain an exact evolution equation of th
Far-field directional scattering and near-field directional coupling from simple sources have recently received great attention in photonics: beyond circularly-polarized dipoles, whose directional coupling to evanescent waves was recently applied to
We describe an efficient near-field to far-field transformation for optical quasinormal modes, which are the dissipative modes of open cavities and plasmonic resonators with complex eigenfrequencies. As an application of the theory, we show how one c
Federated Learning (FL) enables the multiple participating devices to collaboratively contribute to a global neural network model while keeping the training data locally. Unlike the centralized training setting, the non-IID and imbalanced (statistica
Deep neural networks have been shown as a class of useful tools for addressing signal recognition issues in recent years, especially for identifying the nonlinear feature structures of signals. However, this power of most deep learning techniques hea