Spatial Semantic Pointers (SSPs) have recently emerged as a powerful tool for representing and transforming continuous space, with numerous applications to cognitive modelling and deep learning. Fundamental to SSPs is the notion of similarity between vectors representing different points in $n$-dimensional space -- typically the dot product or cosine similarity between vectors with rotated unit-length complex coefficients in the Fourier domain. The similarity measure has previously been conjectured to be a Gaussian function of Euclidean distance. Contrary to this conjecture, we derive a simple trigonometric formula relating spatial displacement to similarity, and prove that, in the case where the Fourier coefficients are uniform i.i.d., the expected similarity is a product of normalized sinc functions: $\prod_{k=1}^{n} \operatorname{sinc} \left( a_k \right)$, where $\mathbf{a} \in \mathbb{R}^n$ is the spatial displacement between the two $n$-dimensional points. This establishes a direct link between space and the similarity of SSPs, which in turn helps bolster a useful mathematical framework for architecting neural networks that manipulate spatial structures.
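The sinc-product result can be checked numerically with a minimal sketch. It rests on assumptions that the abstract does not state: the similarity is taken to be the average of $\cos(\boldsymbol{\theta}_j \cdot \mathbf{a})$ over Fourier phase vectors $\boldsymbol{\theta}_j$, the phases are drawn uniformly i.i.d. from $[-\pi, \pi]$, and the function names `expected_similarity` and `sampled_similarity` are hypothetical, chosen only for illustration.

```python
import math
import random


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def expected_similarity(a):
    """Closed form from the abstract: product of normalized sinc(a_k)."""
    return math.prod(sinc(a_k) for a_k in a)


def sampled_similarity(a, d=200_000, seed=0):
    """Monte Carlo estimate (assumed model, not the paper's code).

    Each of the d Fourier coefficients has an n-dimensional phase vector
    drawn uniformly i.i.d. from [-pi, pi]; the similarity is assumed to be
    the mean of cos(theta . a) over those coefficients.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(d):
        phase = sum(rng.uniform(-math.pi, math.pi) * a_k for a_k in a)
        total += math.cos(phase)
    return total / d


displacement = (0.3, -0.7)
print(expected_similarity(displacement), sampled_similarity(displacement))
```

Under these assumptions the two printed values should agree up to sampling noise, which shrinks roughly as $1/\sqrt{d}$.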
Characterizing in a constructive way the set of real functions whose Fourier transforms are positive appears to be still an open problem. Some sufficient conditions are known, but they are far from exhaustive. We propose two constructive sets of necessary conditions for positivity of the Fourier transform and test their ability to constrain the positivity domain. One uses analytic continuation and Jensen inequalities; the other uses Toeplitz determinants and the Bochner theorem. Applications are discussed, including the extension to the two-dimensional Fourier-Bessel transform and the problem of positive reciprocity, i.e. positive functions with positive transforms.
In this work we verify the sufficiency of Jensen's necessary and sufficient condition for a class of genus 0 or 1 entire functions to have only real zeros. These functions are Fourier transforms of even, positive, indefinitely differentiable, and very fast decreasing functions. We also apply our result to several important special functions in mathematics, such as the modified Bessel function $K_{iz}(a)$, $a>0$, as a function of the variable $z$, the Riemann Xi function $\Xi(z)$, and the character Xi function $\Xi(z;\chi)$ when $\chi$ is a real primitive non-principal character satisfying $\varphi(u;\chi)\ge 0$ on the real line; we prove that these entire functions have only real zeros.
We show that Transformer encoder architectures can be sped up, with limited accuracy costs, by replacing the self-attention sublayers with simple linear transformations that mix input tokens. These linear mixers, along with standard nonlinearities in feed-forward layers, prove competent at modeling semantic relationships in several text classification tasks. Most surprisingly, we find that replacing the self-attention sublayer in a Transformer encoder with a standard, unparameterized Fourier Transform achieves 92-97% of the accuracy of BERT counterparts on the GLUE benchmark, but trains 80% faster on GPUs and 70% faster on TPUs at standard 512 input lengths. At longer input lengths, our FNet model is significantly faster: when compared to the efficient Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, while outpacing the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs). Finally, FNet has a light memory footprint and is particularly efficient at smaller model sizes; for a fixed speed and accuracy budget, small FNet models outperform Transformer counterparts.
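The idea of swapping self-attention for an unparameterized Fourier Transform can be illustrated with a minimal sketch. It assumes the mixing sublayer applies a discrete Fourier transform along the hidden dimension, then along the sequence dimension, and keeps only the real part; the abstract does not spell this out, and the names `dft` and `fourier_mix` are hypothetical. A naive $O(N^2)$ DFT is used for clarity; a practical implementation would use an FFT routine.

```python
import cmath


def dft(values):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def fourier_mix(x):
    """Token mixing with no learned parameters (assumed form of the sublayer).

    x is a list of seq_len rows, each a list of hidden_dim numbers.
    Returns a real-valued list of the same shape.
    """
    # Transform each token's embedding along the hidden dimension.
    rows = [dft(row) for row in x]
    # Transform each hidden channel along the sequence dimension;
    # cols[h][s] holds channel h at sequence frequency s.
    cols = [dft(list(col)) for col in zip(*rows)]
    # Keep the real part and restore the (seq_len, hidden_dim) layout.
    return [[cols[h][s].real for h in range(len(cols))] for s in range(len(x))]


tokens = [[1.0, 2.0], [3.0, 4.0]]
print(fourier_mix(tokens))
```

Every output entry depends on every input token, which is the mixing role that self-attention otherwise plays, but there are no weights to train in this sublayer.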
We show that the adjunction counits of a Fourier-Mukai transform $\Phi$ from $D(X_1)$ to $D(X_2)$ arise from maps of the kernels of the corresponding Fourier-Mukai transforms. In a very general setting of proper separable schemes of finite type over a field we write down these maps of kernels explicitly -- facilitating the computation of the twist (the cone of an adjunction counit) of $\Phi$. We also give another description of these maps, better suited to computing cones if the kernel of $\Phi$ is a pushforward from a closed subscheme $Z$ of $X_1 \times X_2$. Moreover, we show that we can replace the condition of properness of the ambient spaces $X_1$ and $X_2$ by that of $Z$ being proper over them and still have this description apply as is. This can be used, for instance, to compute spherical twists on non-proper varieties directly and in full generality.
In the initial article [Phys. Rev. Lett. 110, 044301 (2013), arXiv:1208.4611] it was claimed that human hearing can beat the Fourier uncertainty principle. In this Comment, we demonstrate that the experiment designed and implemented in the original article was ill-chosen to test Fourier uncertainty in human hearing.