We propose a new kind of geometric effective theory based on curved-spacetime single-valley Dirac theory with a spin connection for twisted bilayer graphene at generic twist angles. This model reproduces the nearly flat bands with particle-hole symmetry around the first magic angle. The bandwidth is close to previous results obtained from the Bistritzer-MacDonald model or density-matrix renormalization group calculations. Moreover, this geometric formalism allows one to predict properties of rotating bilayer graphene that cannot be accessed by earlier theories. As an example, we investigate the Bott index of rotating bilayer graphene and relate it to a two-dimensional Thouless pump, with quantized charge pumped during one driving period that could be verified by transport measurements.
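The rotating-bilayer setup itself is beyond a short snippet, but the Bott index mentioned above has a standard real-space recipe (the Loring-Hastings construction). Below is a minimal Python sketch computing it for a Qi-Wu-Zhang Chern insulator on a torus; the model, system size, and mass parameter are stand-ins chosen for illustration, not the paper's Dirac theory.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def qwz_real_space(L, u):
    """Qi-Wu-Zhang Chern insulator on an L x L torus (2 orbitals/site)."""
    N = L * L
    H = np.zeros((2 * N, 2 * N), dtype=complex)
    idx = lambda x, y: (x % L) * L + (y % L)
    tx = (sz - 1j * sx) / 2            # hopping matrix along +x
    ty = (sz - 1j * sy) / 2            # hopping matrix along +y
    for x in range(L):
        for y in range(L):
            i = idx(x, y)
            H[2*i:2*i+2, 2*i:2*i+2] += u * sz            # on-site mass term
            for dx, dy, t in [(1, 0, tx), (0, 1, ty)]:
                j = idx(x + dx, y + dy)
                H[2*i:2*i+2, 2*j:2*j+2] += t
                H[2*j:2*j+2, 2*i:2*i+2] += t.conj().T    # Hermitian conjugate
    return H

def bott_index(H, L):
    """Bott index from the occupied-band projector (Loring-Hastings)."""
    vals, vecs = np.linalg.eigh(H)
    occ = vecs[:, vals < 0]                      # half filling
    P = occ @ occ.conj().T
    sites = np.arange(L * L)
    X = np.repeat(sites // L, 2)                 # x coordinate per orbital
    Y = np.repeat(sites % L, 2)                  # y coordinate per orbital
    Ux = np.diag(np.exp(2j * np.pi * X / L))
    Uy = np.diag(np.exp(2j * np.pi * Y / L))
    I = np.eye(H.shape[0])
    U = P @ Ux @ P + (I - P)
    V = P @ Uy @ P + (I - P)
    ev = np.linalg.eigvals(V @ U @ V.conj().T @ U.conj().T)
    return round(np.sum(np.angle(ev)) / (2 * np.pi))

H = qwz_real_space(L=12, u=-1.0)   # topological for 0 < |u| < 2
print("Bott index:", bott_index(H, L=12))
```

On a finite torus this returns an integer equal to the Chern number, which is why it is a natural diagnostic for driven or rotating systems where momentum space is unavailable.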
We use numerical relativity to study the merger and ringdown stages of superkick binary black hole systems (those with equal masses and anti-parallel spins). We find a universal way to describe the mass- and current-quadrupole gravitational waves emitted by these systems during the merger and ringdown: (i) the time evolutions of these waves are insensitive to the progenitor's parameters (spins) after being normalized by their own peak values; (ii) the peak values, which encode all the spin information of the progenitor, can be consistently fitted to formulas inspired by post-Newtonian theory. We find that the universal evolution of the mass-quadrupole wave is accurately modeled by the so-called Backwards One-Body (BOB) model. However, the BOB model in its present form leads to a lower waveform match and a significant parameter-estimation bias for the current-quadrupole wave. We also decompose the ringdown signal into seven overtones and study the dependence of the mode amplitudes on the progenitor's parameters; this dependence is found to be insensitive to the overtone index (up to a scaling factor). Finally, we use the Fisher-matrix technique to investigate how the ringdown waveform can be at least as important for parameter estimation as the inspiral stage. Assuming the Cosmic Explorer detector, we find that the contribution of the ringdown portion dominates once the total mass exceeds ~250 solar masses. For massive BBH systems, the accuracy of parameter measurement is improved by incorporating the ringdown information: the ringdown sector gives rise to a parameter correlation different from that of the inspiral stage, hence the overall parameter correlation is reduced in the full signal.
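As a toy illustration of the Fisher-matrix technique mentioned in the last step, the sketch below computes numerical parameter errors and correlations for a single damped-sinusoid "ringdown" mode under white noise. The waveform, noise level, and parameter values are assumptions for illustration; the paper uses full numerical-relativity waveforms and the Cosmic Explorer noise curve.

```python
import numpy as np

def ringdown(t, f, tau, A=1.0, phi=0.0):
    """Toy damped sinusoid standing in for a single ringdown mode."""
    return A * np.exp(-t / tau) * np.cos(2 * np.pi * f * t + phi)

def fisher_matrix(theta, t, sigma=1e-2, eps=1e-6):
    """F_ij = sum_t dh/dtheta_i dh/dtheta_j / sigma^2 (white noise)."""
    theta = np.asarray(theta, float)
    derivs = []
    for i in range(len(theta)):
        dp, dm = theta.copy(), theta.copy()
        h = eps * max(abs(theta[i]), 1.0)        # central-difference step
        dp[i] += h
        dm[i] -= h
        derivs.append((ringdown(t, *dp) - ringdown(t, *dm)) / (2 * h))
    D = np.array(derivs)
    return D @ D.T / sigma**2

t = np.linspace(0.0, 0.1, 4000)        # 100 ms of data (made-up sampling)
theta0 = [250.0, 0.02]                 # f = 250 Hz, tau = 20 ms (made up)
cov = np.linalg.inv(fisher_matrix(theta0, t))
print("1-sigma errors (f, tau):", np.sqrt(np.diag(cov)))
print("f-tau correlation:", cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1]))
```

The off-diagonal entry is the parameter correlation the abstract refers to: combining signal portions with different correlation structures shrinks the joint error ellipse.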
113 - Tao Luo, Zheng Ma, Zhiwei Wang (2021)
A deep neural network (DNN) usually learns the target function from low to high frequency, a phenomenon called the frequency principle or spectral bias. This frequency principle sheds light on a high-frequency curse of DNNs: they find it difficult to learn high-frequency information. Inspired by the frequency principle, a series of works has been devoted to developing algorithms that overcome the high-frequency curse. A natural question arises: what is the upper limit of the decay rate with respect to frequency when one trains a DNN? In this work, our theory, confirmed by numerical experiments, suggests that there is a critical decay rate with respect to frequency in DNN training. Below this upper limit, the DNN interpolates the training data by a function with a certain regularity. Above it, however, the DNN interpolates the training data by a trivial function, i.e., a function that is non-zero only at the training data points. Our results indicate that a better way to overcome the high-frequency curse is to design a proper preconditioning approach that shifts high-frequency information to low frequencies, which coincides with several previously developed algorithms for fast learning of high-frequency information. More importantly, this work rigorously proves that the high-frequency curse is an intrinsic difficulty of DNNs.
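A common way to visualize the frequency principle described above is to track the Fourier spectrum of the fit residual during training: low-frequency bins shrink first. The sketch below does exactly that; the architecture, target function, and frequencies are arbitrary choices for illustration, not the paper's setup.

```python
import numpy as np
import torch

# Target with one low-frequency and one high-frequency component,
# sampled on a uniform grid over [-1, 1].
x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = torch.sin(np.pi * x) + 0.5 * torch.sin(10 * np.pi * x)

net = torch.nn.Sequential(
    torch.nn.Linear(1, 200), torch.nn.Tanh(), torch.nn.Linear(200, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(5001):
    opt.zero_grad()
    loss = ((net(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            res = (net(x) - y).squeeze().numpy()
        spec = np.abs(np.fft.rfft(res))
        # sin(pi x) makes 1 cycle over the window (bin 1);
        # sin(10 pi x) makes 10 cycles (bin 10).
        print(f"step {step:5d}  residual @ bin 1: {spec[1]:.3f}"
              f"  @ bin 10: {spec[10]:.3f}")
```

Running this, the bin-1 residual typically collapses within the first few hundred steps while bin 10 decays much later, which is the spectral bias in action.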
355 - Yaoyu Zhang, Tao Luo, Zheng Ma (2021)
Why heavily parameterized neural networks (NNs) do not overfit the data is an important, long-standing open question. We propose a phenomenological model of NN training to explain this non-overfitting puzzle. Our linear frequency principle (LFP) model accounts for a key dynamical feature of NNs: they learn low frequencies first, irrespective of microscopic details. Theory based on our LFP model shows that low-frequency dominance of the target function is the key condition for the non-overfitting of NNs, which is verified by experiments. Furthermore, through an idealized two-layer NN, we unravel how the detailed microscopic NN training dynamics statistically gives rise to an LFP model with quantitative predictive power.
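A minimal caricature of the LFP picture: each Fourier mode of the model output relaxes toward the target independently, at a rate that decays with frequency, so low frequencies converge first. The power-law exponent and initial condition below are assumptions for illustration, not the paper's derived rates.

```python
import numpy as np

xi = np.arange(1, 6)                    # frequencies 1..5
f_hat = np.ones_like(xi, dtype=float)   # target spectrum (all ones, assumed)
u_hat = np.zeros_like(f_hat)            # model output spectrum at t = 0
gamma = xi.astype(float) ** -4          # per-mode rate; exponent is illustrative

dt, T = 0.1, 200.0
for _ in range(int(T / dt)):
    # Forward-Euler step of  d u_hat / dt = gamma * (f_hat - u_hat)
    u_hat += dt * gamma * (f_hat - u_hat)

for k, u in zip(xi, u_hat):
    print(f"frequency {k}: recovered fraction {u:.3f}")
```

By the end of the run the lowest modes have fully converged while the highest are still far from the target, mirroring the low-frequency-first dynamics the abstract describes.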
64 - Tao Luo, Zheng Ma, Zhiwei Wang (2020)
A supervised learning problem is to find a function in a hypothesis function space given its values on isolated data points. Inspired by the frequency principle in neural networks, we propose a Fourier-domain variational formulation for the supervised learning problem. This formulation circumvents the difficulty of imposing constraints at isolated data points in continuum modelling. Under a necessary and sufficient condition within our unified framework, we establish the well-posedness of the Fourier-domain variational problem by identifying a critical exponent that depends on the data dimension. In practice, a neural network is a convenient way to implement our formulation, and it automatically satisfies the well-posedness condition.
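For intuition about such formulations: when the frequency weight is a Sobolev factor (1 + |xi|^2)^s with s > d/2, minimizing the weighted Fourier norm subject to interpolation constraints is a reproducing-kernel problem, and in one dimension with s = 1 the kernel is exp(-|x - x'|)/2. The specific weight, data, and exponent below are assumptions chosen for illustration, not the paper's general framework.

```python
import numpy as np

# Minimizer of  int (1 + xi^2) |u_hat(xi)|^2 dxi  s.t.  u(x_i) = y_i
# is a kernel interpolant with the H^1 Sobolev kernel (d = 1, s = 1;
# note s = 1 > d/2 = 1/2, so the problem is well posed).
def K(a, b):
    return 0.5 * np.exp(-np.abs(a[:, None] - b[None, :]))

x_train = np.array([-0.8, -0.3, 0.1, 0.6, 0.9])   # made-up data points
y_train = np.sin(2 * np.pi * x_train)

c = np.linalg.solve(K(x_train, x_train), y_train)  # interpolation weights
x_test = np.linspace(-1, 1, 9)
u = K(x_test, x_train) @ c                         # minimum-norm interpolant
print(np.round(u, 3))
```

The heavier the penalty on high frequencies, the smoother the interpolant; when the weight decays too slowly relative to the dimension, the minimizer degenerates, which is the kind of threshold the critical exponent captures.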
In their seminal paper on scattering by an inhomogeneous solid, Debye and coworkers proposed a simple exponentially decaying function for the two-point correlation function of an idealized class of two-phase random media. Such Debye random media, which have been shown to be realizable, are singularly distinct from all other models of two-phase media in that they are entirely defined by their one- and two-point correlation functions. To our knowledge, there has been no determination of other microstructural descriptors of Debye random media. In this paper, we generate Debye random media in two dimensions using an accelerated Yeong-Torquato construction algorithm. We then ascertain microstructural descriptors of the constructed media, including their surface correlation functions, pore-size distributions, lineal-path function, and chord-length probability density function. Accurate semi-analytic and empirical formulas for these descriptors are devised. We compare our results for Debye random media to those of other popular models (overlapping disks and equilibrium hard disks) and find that the former model possesses a wider spectrum of hole sizes, including a substantial fraction of large holes. Our algorithm can be applied to generate other models defined by their two-point correlation functions, and their other microstructural descriptors can be determined and analyzed by the procedures laid out here.
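The Yeong-Torquato construction mentioned above is, at heart, simulated annealing on pixel swaps that drives the two-point correlation of a trial image toward a target. The brute-force sketch below recomputes S2 with FFTs at every step (the paper's accelerated version updates it incrementally); grid size, volume fraction, Debye length, and annealing schedule are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def s2(img):
    """Two-point correlation along x and y via FFT autocorrelation."""
    n = img.shape[0]
    F = np.fft.fft2(img)
    auto = np.fft.ifft2(F * F.conj()).real / img.size
    return 0.5 * (auto[0, :n // 2] + auto[:n // 2, 0])  # average x/y cuts

n, phi, a = 64, 0.5, 4.0          # grid size, volume fraction, Debye length
r = np.arange(n // 2)
target = phi**2 + phi * (1 - phi) * np.exp(-r / a)      # Debye S2 target

img = (rng.random((n, n)) < phi).astype(float)
E = np.sum((s2(img) - target) ** 2)
T = 1e-5                           # initial temperature (assumed schedule)
for _ in range(20000):
    # Swap one white pixel with one black pixel (preserves phi).
    ones, zeros = np.argwhere(img == 1), np.argwhere(img == 0)
    p1 = ones[rng.integers(len(ones))]
    p0 = zeros[rng.integers(len(zeros))]
    img[tuple(p1)], img[tuple(p0)] = 0.0, 1.0
    E_new = np.sum((s2(img) - target) ** 2)
    if E_new < E or rng.random() < np.exp((E - E_new) / T):
        E = E_new                  # accept the swap (Metropolis rule)
    else:
        img[tuple(p1)], img[tuple(p0)] = 1.0, 0.0   # reject: undo swap
    T *= 0.9997                    # geometric cooling
print("final S2 error:", E)
```

Once a realization matches the target S2, the other descriptors (pore sizes, lineal-path, chord lengths) can be measured directly on the constructed image, which is the program the paper carries out.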
Recent works reveal an intriguing phenomenon, the frequency principle (F-Principle): deep neural networks (DNNs) fit the target function from low to high frequency during training, which provides insight into the training and generalization behavior of DNNs in complex tasks. In this paper, through the analysis of an infinite-width two-layer NN in the neural tangent kernel (NTK) regime, we derive the exact differential equation, namely the linear frequency-principle (LFP) model, governing the evolution of the NN output function in the frequency domain during training. Our exact computation applies to general activation functions, with no assumption on the size or distribution of the training data. This LFP model unravels that higher frequencies evolve polynomially or exponentially slower than lower frequencies, depending on the smoothness/regularity of the activation function. We further bridge the gap between training dynamics and generalization by proving that the LFP model implicitly minimizes a frequency-principle norm (FP-norm) of the learned function, by which higher frequencies are penalized more severely, in proportion to the inverse of their evolution rate. Finally, we derive an a priori generalization error bound controlled by the FP-norm of the target function, which provides a theoretical justification for the empirical observation that DNNs often generalize well for low-frequency functions.
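Schematically, the LFP dynamics and FP-norm described above take the following form; the symbol Gamma for the frequency-dependent evolution rate is notation chosen here for illustration, not necessarily the paper's.

```latex
% Each Fourier mode relaxes toward the target at rate \Gamma(\xi),
% and the implicitly minimized norm weights each mode by 1/\Gamma(\xi):
\[
  \partial_t \hat{u}(\xi, t)
    \;=\; \Gamma(\xi)\,\bigl(\hat{f}(\xi) - \hat{u}(\xi, t)\bigr),
  \qquad
  \|u\|_{\mathrm{FP}}^{2}
    \;=\; \int \frac{|\hat{u}(\xi)|^{2}}{\Gamma(\xi)}\,\mathrm{d}\xi .
\]
% \Gamma(\xi) decays polynomially or exponentially in |\xi| depending on
% the smoothness of the activation function, so slowly learned high
% frequencies are exactly the ones the norm penalizes most.
```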
Graph neural networks (GNNs) extend the functionality of traditional neural networks to graph-structured data. As with CNNs, an optimized design of graph convolution and pooling is key to success. Borrowing ideas from physics, we propose a path-integral-based graph neural network (PAN) for classification and regression tasks on graphs. Specifically, we consider a convolution operation that involves every path linking the message sender and receiver, with learnable weights depending on the path length, which corresponds to the maximal-entropy random walk. It generalizes the graph Laplacian to a new transition matrix that we call the maximal entropy transition (MET) matrix, derived from a path-integral formalism. Importantly, the diagonal entries of the MET matrix are directly related to the subgraph centrality, thus providing a natural and adaptive pooling mechanism. PAN provides a versatile framework that can be tailored for different graph data with varying sizes and structures, and most existing GNN architectures can be viewed as special cases of PAN. Experimental results show that PAN achieves state-of-the-art performance on various graph classification/regression tasks, including a new benchmark dataset from statistical mechanics that we propose in order to boost applications of GNNs in the physical sciences.
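The MET construction can be sketched as a weighted sum of adjacency powers, one term per path length. With weights 1/n! the diagonal approaches the subgraph centrality diag(exp(A)) mentioned in the abstract; in PAN the weights are learnable, so the fixed choice below is an assumption for illustration.

```python
import numpy as np
from math import factorial

def met_matrix(A, L=4, weights=None):
    """Weighted sum of adjacency powers: M = sum_{n=0}^{L} w_n A^n."""
    if weights is None:
        weights = [1.0 / factorial(n) for n in range(L + 1)]  # assumed
    M = np.zeros_like(A, dtype=float)
    P = np.eye(len(A))                 # A^0
    for w in weights:
        M += w * P
        P = P @ A                      # next adjacency power
    return M

# Toy 4-node path graph 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], float)
M = met_matrix(A)
conv = M / M.sum(axis=1, keepdims=True)   # row-normalized transition matrix
print("pooling scores (~ subgraph centrality):", np.round(np.diag(M), 3))
```

On the path graph the interior nodes get higher diagonal scores than the endpoints, which is the adaptive node-importance signal the pooling mechanism exploits.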
124 - Yu Guang Wang, Ming Li, Zheng Ma (2019)
Deep graph neural networks (GNNs) are useful models for graph classification and graph-based regression tasks. In these tasks, graph pooling is a critical ingredient by which GNNs adapt to input graphs of varying size and structure. We propose a new graph pooling operation based on compressive Haar transforms: HaarPooling. HaarPooling implements a cascade of pooling operations; it is computed by following a sequence of clusterings of the input graph. A HaarPooling layer transforms a given input graph into an output graph with fewer nodes and the same feature dimension; the compressive Haar transform filters out fine-detail information in the Haar wavelet domain. In this way, all the HaarPooling layers together synthesize the features of any given input graph into a feature vector of uniform size. Such transforms provide a sparse characterization of the data and preserve the structural information of the input graph. GNNs implemented with standard graph convolution layers and HaarPooling layers achieve state-of-the-art performance on diverse graph classification and regression problems.
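A stripped-down caricature of one such pooling step: given a clustering of the nodes, keep only the sqrt-normalized cluster sums (the coarse Haar scaling coefficients) and drop the fine "detail" coefficients. The full method builds a compressive Haar basis from a whole clustering tree; the data and cluster assignment below are made up.

```python
import numpy as np

def haar_pool(X, clusters):
    """Keep the coarse (scaling-function) Haar coefficient per cluster.

    X        : (num_nodes, num_features) node feature matrix
    clusters : (num_nodes,) integer cluster assignment
    Returns one sqrt-normalized row per cluster; the fine Haar detail
    coefficients are discarded, shrinking the graph.
    """
    ids = np.unique(clusters)
    out = np.zeros((len(ids), X.shape[1]))
    for k, c in enumerate(ids):
        members = X[clusters == c]
        out[k] = members.sum(axis=0) / np.sqrt(len(members))
    return out

X = np.arange(12, dtype=float).reshape(6, 2)   # 6 nodes, 2 features (toy)
clusters = np.array([0, 0, 1, 1, 1, 2])        # assumed 3-cluster assignment
print(haar_pool(X, clusters))                  # pooled graph: 3 nodes
```

Chaining such steps down a clustering tree is what lets arbitrary input graphs end up as fixed-size feature vectors.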
We use the strong intrinsic non-linearity of a microwave superconducting qubit with a 4 GHz transition frequency to directly detect and control the energy of a micro-mechanical oscillator vibrating at 25 MHz. The qubit and the oscillator are coupled electrostatically at a rate of approximately 2π × 22 MHz. In this far off-resonant regime, the qubit frequency is shifted by 0.52 MHz per oscillator phonon, about 14% of the 3.7 MHz qubit linewidth. The qubit behaves as a vibrational energy detector, and from its lineshape we extract the phonon-number distribution of the oscillator. We manipulate this distribution by driving number-state-sensitive sideband transitions, creating profoundly non-thermal states. Finally, by driving the lower-frequency sideband transition, we cool the oscillator and increase its ground-state population up to 0.48 ± 0.13, close to a factor of 8 above its value at thermal equilibrium. These results demonstrate a new class of electromechanics experiments that offer a promising strategy for quantum non-demolition measurements and non-classical state preparation.
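The phonon-number readout works because each phonon shifts the qubit by a fixed amount, so the qubit lineshape is a phonon-number-weighted sum of Lorentzians. The sketch below uses the abstract's 0.52 MHz shift and 3.7 MHz linewidth, but the thermal occupation is an assumed example value.

```python
import numpy as np

chi, gamma = 0.52, 3.7        # dispersive shift, qubit linewidth (MHz; from abstract)
n_bar = 4.0                   # assumed mean thermal phonon number
n = np.arange(40)
P = n_bar**n / (1 + n_bar)**(n + 1)      # thermal (Bose) number distribution

def lorentzian(d):
    """Unit-height Lorentzian of FWHM gamma at detuning d (MHz)."""
    return 1.0 / (1.0 + (2 * d / gamma) ** 2)

# Qubit response vs. detuning: each phonon number n contributes a
# Lorentzian shifted by n * chi, weighted by its occupation P(n).
for d in np.linspace(-10, 20, 7):
    response = np.sum(P * lorentzian(d - chi * n))
    print(f"detuning {d:6.1f} MHz -> response {response:.3f}")
```

Fitting a measured lineshape to this sum with the P(n) as free parameters is how the phonon-number distribution is extracted in such dispersive-readout experiments.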