The realization of deep learning with coherent diffraction has advanced remarkably, benefiting from the fact that matrix multiplication can be executed optically in parallel and with little power consumption. A coherent optical field, which propagates as a complex-valued entity, can be manipulated into a task-oriented output through statistical inference. In this paper, we present a unitary learning protocol for deep diffractive neural networks that meets the physical unitary prior of coherent diffraction. Unitary learning is a backpropagation scheme that updates unitary weights by translating gradients between Euclidean and Riemannian space. The temporal-space evolution characteristic of unitary learning is formulated and elucidated. In particular, we unveil a compatibility condition for selecting nonlinear activations in complex space, encapsulating the fundamental sigmoid, tanh, and quasi-ReLU in complex space. As a preliminary application, a deep diffractive neural network with unitary learning is tentatively implemented on 2D classification and verification tasks.
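As a concrete illustration of the Euclidean-to-Riemannian gradient translation mentioned above, the following minimal NumPy sketch performs one multiplicative update on the unitary group. The exponential-map retraction and the step size `lr` are standard choices assumed here for illustration; the paper's exact update rule may differ.

```python
import numpy as np
from scipy.linalg import expm

def unitary_update(W, G, lr=0.01):
    """One Riemannian gradient step on the unitary group U(n).

    W  : current unitary weight matrix (n x n, complex)
    G  : Euclidean gradient dL/dW (n x n, complex)
    lr : step size (illustrative hyperparameter)
    """
    # Project the Euclidean gradient onto the tangent space of U(n):
    # A = G W^H - W G^H is skew-Hermitian by construction.
    A = G @ W.conj().T - W @ G.conj().T
    # Retract along the group: expm of a skew-Hermitian matrix is unitary,
    # so the updated weights remain exactly on U(n).
    return expm(-lr * A) @ W

# Quick check that unitarity is preserved after an update.
n = 4
W = np.linalg.qr(np.random.randn(n, n) + 1j * np.random.randn(n, n))[0]
G = np.random.randn(n, n) + 1j * np.random.randn(n, n)
W_new = unitary_update(W, G)
assert np.allclose(W_new.conj().T @ W_new, np.eye(n), atol=1e-8)
```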
We introduce an all-optical Diffractive Deep Neural Network (D2NN) architecture that can learn to implement various functions after deep learning-based design of passive diffractive layers that work collectively. We experimentally demonstrated the success of this framework by creating 3D-printed D2NNs that learned to implement handwritten digit classification and the function of an imaging lens in the terahertz spectrum. With the existing plethora of 3D-printing and other lithographic fabrication methods as well as spatial light modulators, this all-optical deep learning framework can perform, at the speed of light, various complex functions that computer-based neural networks can implement, and will find applications in all-optical image analysis, feature detection, and object classification, while also enabling new camera designs and optical components that can learn to perform unique tasks using D2NNs.
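For readers who want a numerical feel for a D2NN forward pass, the sketch below propagates a coherent field through a stack of phase-only layers, assuming angular-spectrum free-space propagation between layers. The function names and the parameters `wavelength`, `dx`, and `z` are illustrative assumptions, not values from the experiment.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, z):
    """Free-space propagation of a coherent field (angular spectrum method)."""
    n = field.shape[0]                         # assumes a square n x n field
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    prop = arg > 0                             # propagating (non-evanescent) modes
    kz = 2 * np.pi * np.sqrt(np.where(prop, arg, 0.0))
    H = np.where(prop, np.exp(1j * kz * z), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * H)

def d2nn_forward(field, phase_layers, wavelength, dx, z):
    """Pass a field through trainable passive phase layers, then read intensity."""
    for phase in phase_layers:
        field = field * np.exp(1j * phase)     # phase-only modulation at the layer
        field = angular_spectrum_propagate(field, wavelength, dx, z)
    return np.abs(field) ** 2                  # the detector measures intensity
```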
This paper introduces two recurrent neural network structures called the Simple Gated Unit (SGU) and the Deep Simple Gated Unit (DSGU), which are general structures for learning long-term dependencies. Compared to the traditional Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), both structures require fewer parameters and less computation time in sequence classification tasks. Unlike the GRU and LSTM, which require more than one gate to control the information flow in the network, the SGU and DSGU use only one multiplicative gate to control the flow of information. We show that this difference can accelerate learning in tasks that require long-range dependency information. We also show that the DSGU is more numerically stable than the SGU. In addition, we propose a standard way of representing the internal structure of an RNN, called the RNN Conventional Graph (RCG), which helps in analyzing the relationship between the input units and hidden units of an RNN.
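To make the single-gate idea concrete, here is a generic recurrent cell in which one multiplicative gate interpolates between the previous hidden state and a candidate state. These equations are an assumed minimal form for illustration, not necessarily the exact SGU definition from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def single_gate_step(x, h, params):
    """One step of a recurrent cell with a single multiplicative gate."""
    Wg, Ug, bg, Wc, Uc, bc = params
    g = sigmoid(Wg @ x + Ug @ h + bg)   # the one multiplicative gate
    c = np.tanh(Wc @ x + Uc @ h + bc)   # candidate hidden state
    return g * c + (1.0 - g) * h        # gate decides how much to update
```

With hidden size n_h and input size n_x, this cell has two weight/bias groups instead of the GRU's three or the LSTM's four, which is where the parameter and computation savings come from.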
Graph neural networks (GNNs) are naturally distributed architectures for learning representations from network data. This renders them suitable candidates for decentralized tasks. In these scenarios, the underlying graph often changes with time due to link failures or topology variations, creating a mismatch between the graphs on which GNNs were trained and the ones on which they are tested. Online learning can be leveraged to retrain GNNs at testing time to overcome this issue. However, most online algorithms are centralized and usually offer guarantees only on convex problems, which GNNs rarely lead to. This paper develops the Wide and Deep GNN (WD-GNN), a novel architecture that can be updated with distributed online learning mechanisms. The WD-GNN consists of two components: the wide part is a linear graph filter and the deep part is a nonlinear GNN. At training time, the joint wide and deep architecture learns nonlinear representations from data. At testing time, the wide, linear part is retrained, while the deep, nonlinear one remains fixed. This often leads to a convex formulation. We further propose a distributed online learning algorithm that can be implemented in a decentralized setting. We also show the stability of the WD-GNN to changes of the underlying graph and analyze the convergence of the proposed online learning procedure. Experiments on movie recommendation, source localization and robot swarm control corroborate theoretical findings and show the potential of the WD-GNN for distributed online learning.
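A minimal single-feature sketch of the wide component and its convex retraining, assuming the wide part is a polynomial graph filter over a shift operator S (e.g., an adjacency or Laplacian matrix). The helper names and the least-squares target are illustrative; in the paper the deep part is held fixed at test time, so `target` would be the residual the wide part must fit.

```python
import numpy as np

def graph_filter(S, x, h):
    """Wide (linear) part: polynomial graph filter sum_k h[k] * S^k @ x."""
    out = np.zeros_like(x, dtype=float)
    Sk_x = x.astype(float).copy()
    for hk in h:
        out += hk * Sk_x
        Sk_x = S @ Sk_x                 # one more graph shift: S^k x -> S^{k+1} x
    return out

def retrain_wide(S, x, target, K):
    """Retrain only the K filter taps h: a convex least-squares problem."""
    cols, Sk_x = [], x.astype(float).copy()
    for _ in range(K):
        cols.append(Sk_x)
        Sk_x = S @ Sk_x
    A = np.stack(cols, axis=1)          # (n_nodes, K) design matrix
    h, *_ = np.linalg.lstsq(A, target, rcond=None)
    return h
```

Because only the linear taps h are updated at test time, the online problem stays convex even though the frozen deep part is nonlinear.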
Approaches to machine intelligence based on brain models have stressed the use of neural networks for generalization. Here we propose a hybrid neural network architecture that uses two kinds of neural networks simultaneously: (i) a surface-learning agent that quickly adapts to new modes of operation; and (ii) a deep-learning agent that is very accurate within a specific regime of operation. The two networks of the hybrid architecture perform complementary functions that improve the overall performance. The performance of the hybrid architecture has been compared with that of back-propagation perceptrons and the CC and FC networks on chaotic time-series prediction, the CATS benchmark test, and smooth function approximation. The hybrid architecture is shown to provide superior performance under the RMS error criterion.
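One plausible minimal realization of such a hybrid, assumed here purely for illustration (the paper does not specify this exact combination): a fixed deep predictor whose output is corrected online by a fast linear surface learner trained with an LMS rule.

```python
import numpy as np

class HybridPredictor:
    """Fixed deep model plus a fast-adapting linear 'surface' corrector."""
    def __init__(self, deep_model, n_features, lr=0.01):
        self.deep = deep_model          # accurate within its trained regime
        self.w = np.zeros(n_features)   # surface learner, starts neutral
        self.lr = lr

    def predict(self, x):
        return self.deep(x) + self.w @ x

    def adapt(self, x, y):
        # Only the surface part adapts, so new operating modes are
        # tracked quickly without retraining the deep network.
        err = y - self.predict(x)
        self.w += self.lr * err * x
```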
Unitary learning is a backpropagation scheme that updates unitary weights in a fully connected deep complex-valued neural network, satisfying the physical unitary prior of the diffractive deep neural network ([DN]2). However, the square-matrix property of unitary weights limits the dimension of the function signal, which hinders generalization. To address the overfitting that arises from the small number of samples loaded into [DN]2, an optical phase-dropout trick is implemented. Phase dropout in unitary space, which evolves from complex dropout and admits a statistical-inference interpretation, is formulated for the first time. A synthetic mask, recreated from random point apertures with random phase shifts and a smoothed modulation, tailors the redundant links by incompletely sampling the input optical field at each diffractive layer. The physical features of the synthetic mask under different nonlinear activations are elucidated in detail. The equivalence between the digital and diffractive models determines compound modulations that can successfully circumvent the nonlinear activations physically implemented in [DN]2. Numerical experiments verify the superiority of optical phase dropout in [DN]2 for enhancing accuracy on 2D classification and recognition tasks.
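A hedged sketch of the synthetic mask described above, assuming random binary point apertures, uniform random phase shifts, and Gaussian smoothing of the resulting complex mask; `keep_prob` and `sigma` are illustrative hyperparameters, not values from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def phase_dropout_mask(shape, keep_prob=0.8, sigma=1.5, rng=None):
    """Synthetic mask: random point apertures with random phase shifts,
    smoothed so the modulation stays diffraction-compatible."""
    rng = np.random.default_rng() if rng is None else rng
    aperture = (rng.random(shape) < keep_prob).astype(float)  # point apertures
    phase = rng.uniform(0.0, 2.0 * np.pi, shape)              # random phase shifts
    mask = aperture * np.exp(1j * phase)
    # Smooth real and imaginary parts separately to soften the apertures.
    return gaussian_filter(mask.real, sigma) + 1j * gaussian_filter(mask.imag, sigma)

# During training, the mask incompletely samples the field at a layer:
# field = field * phase_dropout_mask(field.shape)
```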