
Fastest learning in small world neural networks

Published by: Helmut Kroger
Publication date: 2004
Research field: Physics
Research language: English





We investigate supervised learning in neural networks. We consider a multi-layered feed-forward network trained with backpropagation. We find that a network with small-world connectivity reduces both the learning error and the learning time compared with networks of regular or random connectivity. Our study has potential applications in data mining, image processing, speech recognition, and pattern recognition.
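
The abstract does not say how the small-world connectivity is constructed, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes a Watts-Strogatz-style rewiring applied to a layered feed-forward graph: each connection is redirected, with probability `p`, to a randomly chosen neuron in a later layer, so that some connections skip layers. The function names, the fully connected starting point, and the rewiring rule are all hypothetical choices made for illustration.

```python
import random


def layered_edges(n_layers, width):
    """Regular wiring (assumed): each neuron feeds every neuron in the next layer."""
    edges = set()
    for layer in range(n_layers - 1):
        for i in range(width):
            for j in range(width):
                edges.add(((layer, i), (layer + 1, j)))
    return edges


def rewire(edges, n_layers, width, p, seed=0):
    """With probability p, redirect an edge to a random neuron in a later layer.

    Hypothetical rule: the source neuron is kept, the target is redrawn from
    all later-layer neurons the source does not already connect to. The
    network stays feed-forward and the number of edges is unchanged.
    """
    rng = random.Random(seed)
    result = set(edges)
    for src, dst in sorted(edges):
        if rng.random() >= p:
            continue
        candidates = [
            (layer, j)
            for layer in range(src[0] + 1, n_layers)
            for j in range(width)
            if (src, (layer, j)) not in result
        ]
        if not candidates:
            continue  # e.g. the last hidden layer has nowhere else to connect
        result.remove((src, dst))
        result.add((src, rng.choice(candidates)))
    return result


def shortcut_fraction(edges):
    """Fraction of connections that skip at least one layer."""
    return sum(1 for src, dst in edges if dst[0] - src[0] > 1) / len(edges)


regular = layered_edges(n_layers=4, width=3)
small_world = rewire(regular, n_layers=4, width=3, p=0.2)
print(len(regular), len(small_world), shortcut_fraction(small_world))
```

In this sketch, `p = 0` leaves the regular layered wiring untouched, while larger values of `p` replace more of it with random connections; the small-world regime referred to in the abstract would lie between those two extremes.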




Read also

We study the effective resistance of small-world resistor networks. Utilizing recent analytic results for the propagator of the Edwards-Wilkinson process on small-world networks, we obtain the asymptotic behavior of the disorder-averaged two-point resistance in the large system-size limit. We find that the small-world structure suppresses large network resistances: both the average resistance and its standard deviation approach a finite value in the large system-size limit for any non-zero density of random links. We also consider a scenario where the link conductance decays as a power of the length of the random links, $l^{-\alpha}$. In this case we find that the average effective system resistance diverges for any non-zero value of $\alpha$.
Greg Yang, Edward J. Hu, 2020
As its width tends to infinity, a deep neural network's behavior under gradient descent can become simplified and predictable (e.g. given by the Neural Tangent Kernel (NTK)), if it is parametrized appropriately (e.g. the NTK parametrization). However, we show that the standard and NTK parametrizations of a neural network do not admit infinite-width limits that can learn features, which is crucial for pretraining and transfer learning such as with BERT. We propose simple modifications to the standard parametrization to allow for feature learning in the limit. Using the *Tensor Programs* technique, we derive explicit formulas for such limits. On Word2Vec and few-shot learning on Omniglot via MAML, two canonical tasks that rely crucially on feature learning, we compute these limits exactly. We find that they outperform both NTK baselines and finite-width networks, with the latter approaching the infinite-width feature learning performance as width increases. More generally, we classify a natural space of neural network parametrizations that generalizes standard, NTK, and Mean Field parametrizations. We show 1) any parametrization in this space either admits feature learning or has an infinite-width training dynamics given by kernel gradient descent, but not both; 2) any such infinite-width limit can be computed using the Tensor Programs technique. Code for our experiments can be found at github.com/edwardjhu/TP4.
The transition to turbulence via spatiotemporal intermittency is investigated in the context of coupled maps defined on small-world networks. The local dynamics is given by the Chate-Manneville minimal map previously used in studies of spatiotemporal intermittency in ordered lattices. The critical boundary separating laminar and turbulent regimes is calculated on the parameter space of the system, given by the coupling strength and the rewiring probability of the network. Windows of relaminarization are present in some regions of the parameter space. New features arise in small-world networks; for instance, the character of the transition to turbulence changes from second order to a first order phase transition at some critical value of the rewiring probability. A linear relation characterizing the change in the order of the phase transition is found. The global quantity used as order parameter for the transition also exhibits nontrivial collective behavior for some values of the parameters. These models may describe several processes occurring in nonuniform media where the degree of disorder can be continuously varied through a parameter.
Graphical models are widely used in science to represent joint probability distributions with an underlying conditional dependence structure. The inverse problem of learning a discrete graphical model given i.i.d. samples from its joint distribution can be solved with near-optimal sample complexity using a convex optimization method known as Generalized Regularized Interaction Screening Estimator (GRISE). But the computational cost of GRISE becomes prohibitive when the energy function of the true graphical model has higher-order terms. We introduce NeurISE, a neural net based algorithm for graphical model learning, to tackle this limitation of GRISE. We use neural nets as function approximators in an Interaction Screening objective function. The optimization of this objective then produces a neural-net representation for the conditionals of the graphical model. The NeurISE algorithm is seen to be a better alternative to GRISE when the energy function of the true model has a high order with a high degree of symmetry. In these cases NeurISE is able to find the correct parsimonious representation for the conditionals without being fed any prior information about the true model. NeurISE can also be used to learn the underlying structure of the true model with some simple modifications to its training procedure. In addition, we show a variant of NeurISE that can be used to learn a neural net representation for the full energy function of the true model.
D. Bolle, R. Heylen, 2007
We study the thermodynamic properties of spin systems with bond-disorder on small-world hypergraphs, obtained by superimposing a one-dimensional Ising chain onto a random Bethe graph with p-spin interactions. Using transfer-matrix techniques, we derive fixed-point equations describing the relevant order parameters and the free energy, both in the replica symmetric and one step replica symmetry breaking approximation. We determine the static and dynamic ferromagnetic transition and the spin-glass transition within replica symmetry for all temperatures, and demonstrate corrections to these results when one step replica symmetry breaking is taken into account. The results obtained are in agreement with Monte-Carlo simulations.