
The universal approximation theorem for complex-valued neural networks

Posted by: Felix Voigtlaender
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





We generalize the classical universal approximation theorem for neural networks to the case of complex-valued neural networks. Precisely, we consider feedforward networks with a complex activation function $\sigma : \mathbb{C} \to \mathbb{C}$ in which each neuron performs the operation $\mathbb{C}^N \to \mathbb{C},\ z \mapsto \sigma(b + w^T z)$ with weights $w \in \mathbb{C}^N$ and a bias $b \in \mathbb{C}$, and with $\sigma$ applied componentwise. We completely characterize those activation functions $\sigma$ for which the associated complex networks have the universal approximation property, meaning that they can uniformly approximate any continuous function on any compact subset of $\mathbb{C}^d$ arbitrarily well. Unlike the classical case of real networks, the set of good activation functions which give rise to networks with the universal approximation property differs significantly depending on whether one considers deep networks or shallow networks: For deep networks with at least two hidden layers, the universal approximation property holds as long as $\sigma$ is neither a polynomial, a holomorphic function, nor an antiholomorphic function. Shallow networks, on the other hand, are universal if and only if the real part or the imaginary part of $\sigma$ is not a polyharmonic function.
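To make the neuron operation concrete, here is a minimal NumPy sketch of a shallow complex-valued network built from units $z \mapsto \sigma(b + w^T z)$. The split-ReLU activation used below is only an illustrative example of a non-holomorphic choice of $\sigma$ (it is not specified in the abstract), and all parameter names, shapes, and random values are assumptions made for the example.

```python
import numpy as np

def sigma(z):
    # Split (CReLU-style) activation applied componentwise:
    # sigma(z) = ReLU(Re z) + i * ReLU(Im z).
    # Illustrative choice only; the paper characterizes which activations
    # actually yield the universal approximation property.
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

def complex_neuron(z, w, b):
    # One neuron C^N -> C: z |-> sigma(b + w^T z), with w in C^N and b in C.
    return sigma(b + w @ z)

def shallow_complex_net(z, W, b, v, c):
    # One hidden layer of M such neurons followed by a complex-linear readout.
    h = sigma(b + W @ z)   # W: (M, N) complex, b: (M,) complex
    return c + v @ h       # v: (M,) complex, c: complex scalar

# Tiny usage example with random complex parameters.
rng = np.random.default_rng(0)
N, M = 3, 8
z = rng.standard_normal(N) + 1j * rng.standard_normal(N)
W = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
b = rng.standard_normal(M) + 1j * rng.standard_normal(M)
v = rng.standard_normal(M) + 1j * rng.standard_normal(M)
print(shallow_complex_net(z, W, b, v, c=0.0 + 0.0j))
```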




Read also

We study the expressivity of deep neural networks. Measuring a network's complexity by its number of connections or by its number of neurons, we consider the class of functions for which the error of best approximation with networks of a given complexity decays at a certain rate when increasing the complexity budget. Using results from classical approximation theory, we show that this class can be endowed with a (quasi)-norm that makes it a linear function space, called an approximation space. We establish that allowing the networks to have certain types of skip connections does not change the resulting approximation spaces. We also discuss the role of the network's nonlinearity (also known as the activation function) on the resulting spaces, as well as the role of depth. For the popular ReLU nonlinearity and its powers, we relate the newly constructed spaces to classical Besov spaces. The established embeddings highlight that some functions of very low Besov smoothness can nevertheless be well approximated by neural networks, if these networks are sufficiently deep.
We prove two universal approximation theorems for a range of dropout neural networks. These are feed-forward neural networks in which each edge is given a random $\{0,1\}$-valued filter, and which have two modes of operation: in the first, each edge output is multiplied by its random filter, resulting in a random output, while in the second, each edge output is multiplied by the expectation of its filter, leading to a deterministic output. It is common to use the random mode during training and the deterministic mode during testing and prediction. Both theorems are of the following form: Given a function to approximate and a threshold $\varepsilon>0$, there exists a dropout network that is $\varepsilon$-close in probability and in $L^q$. The first theorem applies to dropout networks in the random mode. It assumes little about the activation function, applies to a wide class of networks, and can even be applied to approximation schemes other than neural networks. The core is an algebraic property that shows that deterministic networks can be exactly matched in expectation by random networks. The second theorem makes stronger assumptions and gives a stronger result. Given a function to approximate, it provides the existence of a network that approximates in both modes simultaneously. Proof components are a recursive replacement of edges by independent copies, and a special first-layer replacement that couples the resulting larger network to the input. The functions to be approximated are assumed to be elements of general normed spaces, and the approximations are measured in the corresponding norms. The networks are constructed explicitly. Because of the different methods of proof, the two results give independent insight into the approximation properties of random dropout networks. With this, we establish that dropout neural networks broadly satisfy a universal-approximation property.
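A minimal sketch of the two modes of operation described above, assuming Bernoulli($p$) edge filters and a ReLU activation (both illustrative assumptions, not details taken from the abstract): in the random mode each edge weight is multiplied by an independent $\{0,1\}$ sample, and in the deterministic mode by its expectation $p$.

```python
import numpy as np

def dropout_layer(x, W, b, p, mode, rng=None):
    # One dense layer in which every edge carries a {0,1}-valued
    # Bernoulli(p) filter (an assumed distribution for this sketch).
    if mode == "random":
        # Random mode: multiply each edge output by an independent filter sample.
        F = (rng.random(W.shape) < p).astype(W.dtype)
        pre = (F * W) @ x + b
    else:
        # Deterministic mode: multiply each edge output by the filter's expectation p.
        pre = (p * W) @ x + b
    return np.maximum(pre, 0.0)   # ReLU, an illustrative activation choice

rng = np.random.default_rng(1)
x = rng.standard_normal(4)
W = rng.standard_normal((3, 4))
b = np.zeros(3)
print(dropout_layer(x, W, b, p=0.8, mode="random", rng=rng))   # random output
print(dropout_layer(x, W, b, p=0.8, mode="deterministic"))     # deterministic output
```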
Angela Capel, Jesus Ocariz (2020)
This paper concerns the universal approximation property with neural networks in variable Lebesgue spaces. We show that, whenever the exponent function of the space is bounded, every function can be approximated with shallow neural networks with any desired accuracy. This result subsequently allows us to characterize the universality of the approximation in terms of the boundedness of the exponent function. Furthermore, whenever the exponent is unbounded, we obtain some characterization results for the subspace of functions that can be approximated.
We prove that for every Banach space $Y$, the Besov spaces of functions from the $n$-dimensional Euclidean space to $Y$ agree with suitable local approximation spaces with equivalent norms. In addition, we prove that the Sobolev spaces of type $q$ are continuously embedded in the Besov spaces of the same type if and only if $Y$ has martingale cotype $q$. We interpret this as an extension of earlier results of Xu (1998), and Martinez, Torrea and Xu (2006). These two results combined give the characterization that $Y$ admits an equivalent norm with modulus of convexity of power type $q$ if and only if weakly differentiable functions have good local approximations with polynomials.
Many real-world signal sources are complex-valued, having real and imaginary components. However, the vast majority of existing deep learning platforms and network architectures do not support the use of complex-valued data. MRI data is inherently complex-valued, so existing approaches discard the richer algebraic structure of the complex data. In this work, we investigate end-to-end complex-valued convolutional neural networks, specifically for image reconstruction in lieu of two-channel real-valued networks. We apply this to magnetic resonance imaging reconstruction for the purpose of accelerating scan times and determine the performance of various promising complex-valued activation functions. We find that complex-valued CNNs with complex-valued convolutions provide superior reconstructions compared to real-valued convolutions with the same number of trainable parameters, over a variety of network architectures and datasets.
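As a rough illustration of what a complex-valued convolution involves (not the paper's implementation), the sketch below expresses a 1-D complex convolution through four real convolutions, which is how such layers are commonly realized on top of real-valued frameworks; the signal, kernel, and padding mode are made up for the example.

```python
import numpy as np

def complex_conv1d(x, k):
    # A complex convolution written as four real convolutions:
    # (x_r + i x_i) * (k_r + i k_i)
    #   = (x_r * k_r - x_i * k_i) + i (x_r * k_i + x_i * k_r).
    real = np.convolve(x.real, k.real, "same") - np.convolve(x.imag, k.imag, "same")
    imag = np.convolve(x.real, k.imag, "same") + np.convolve(x.imag, k.real, "same")
    return real + 1j * imag

x = np.exp(1j * np.linspace(0.0, np.pi, 8))             # toy complex signal
k = np.array([0.25 + 0.0j, 0.5 + 0.25j, 0.25 + 0.0j])   # made-up complex kernel
print(complex_conv1d(x, k))
print(np.convolve(x, k, "same"))   # same result, computed directly over C
```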
