Dynamics of feed forward induced interference training


Abstract in English

Updating perceptron models with backpropagation has become the routine of deep learning. Backpropagation requires a continuous feed-forward procedure in order to function properly. Doubting the physical interpretation of transformer-based models such as GPT implied by the routine explanation, a new training method is proposed that keeps the physics self-consistent. The GPT model is treated as a space-time diagram, and the worldlines of signals are traced to identify the possible paths that signals can take for a self-attention event to occur. With a slight modification, self-attention can be viewed as an Ising model interaction, which allows the objective to be designed as the energy of the system. The target is treated as an external magnetic field that induces signals modeled as magnetic dipoles. A probability network is designed to pilot input signals travelling for different durations through different routes. A rule for updating the probabilities is designed to form constructive interference at target locations so that the instantaneous energy is maximised. An experiment was conducted on a 4-class classification problem extracted from MNIST. The results exhibit interesting but expected behaviours that do not exist in a network updated with backpropagation and that more closely resemble learning in a real human, especially in the few-shot scenario.
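As a rough illustration of the Ising picture mentioned above, the following minimal sketch evaluates the textbook Ising Hamiltonian H = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i for a handful of spins in an external field. It is not the training method of this work: the function name `ising_hamiltonian`, the coupling values, and the mapping of "signals" to spins and "target" to the field `h` are all assumptions made for illustration. In this standard sign convention, spins aligned with each other and with the field give the lowest H, so the "energy" maximised in the abstract would presumably correspond to -H; that reading is also an assumption.

```python
def ising_hamiltonian(spins, couplings, field):
    """Textbook Ising energy: H = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i.

    spins     : list of +1/-1 values (hypothetical stand-in for signals)
    couplings : square matrix J, only entries with i < j are used
    field     : list h (hypothetical stand-in for the target)
    """
    n = len(spins)
    # Pairwise interaction term, counting each pair once (i < j).
    pair_term = sum(
        couplings[i][j] * spins[i] * spins[j]
        for i in range(n)
        for j in range(i + 1, n)
    )
    # Coupling of each spin to the external field.
    field_term = sum(h * s for h, s in zip(field, spins))
    return -pair_term - field_term


# Illustrative values only: uniform ferromagnetic coupling, uniform field.
J = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
h = [1, 1, 1]

aligned = ising_hamiltonian([1, 1, 1], J, h)    # all spins follow the field
mixed = ising_hamiltonian([1, 1, -1], J, h)     # one spin opposes the field
print(aligned, mixed)
```

The fully aligned configuration has the lower H, which is the sense in which a field can be said to pull dipoles toward a target configuration.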
