113 - Tao Luo, Zheng Ma, Zhiwei Wang 2021
A deep neural network (DNN) usually learns the target function from low to high frequency, a phenomenon known as the frequency principle or spectral bias. The frequency principle sheds light on a high-frequency curse of DNNs: high-frequency information is difficult to learn. Inspired by the frequency principle, a series of works has been devoted to developing algorithms that overcome the high-frequency curse. A natural question arises: what is the upper limit of the decaying rate w.r.t. frequency when one trains a DNN? In this work, our theory, confirmed by numerical experiments, suggests that there is a critical decaying rate w.r.t. frequency in DNN training. Below the upper limit of the decaying rate, the DNN interpolates the training data by a function with a certain regularity. Above the upper limit, however, the DNN interpolates the training data by a trivial function, i.e., a function that is non-zero only at the training data points. Our results indicate that a better way to overcome the high-frequency curse is to design a proper pre-conditioning approach that shifts high-frequency information to low frequencies, which coincides with several previously developed algorithms for fast learning of high-frequency information. More importantly, this work rigorously proves that the high-frequency curse is an intrinsic difficulty of DNNs.
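As a rough illustration of the frequency principle (a toy sketch of ours, not an experiment from the paper), the following PyTorch snippet fits a two-frequency target and tracks the residual at each frequency during training; the low-frequency component is typically fit first.

```python
# Toy illustration of the frequency principle: fit y = sin(x) + sin(8x) with a
# small MLP and watch the residual at each frequency shrink at different rates.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(-np.pi, np.pi, 256).unsqueeze(1)
y = torch.sin(x) + torch.sin(8 * x)

model = nn.Sequential(nn.Linear(1, 128), nn.Tanh(), nn.Linear(128, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(5001):
    opt.zero_grad()
    loss = ((model(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        # Project the residual onto each frequency to see which is learned first.
        with torch.no_grad():
            r = (y - model(x)).squeeze()
            low = (r * torch.sin(x.squeeze())).mean().abs().item()
            high = (r * torch.sin(8 * x.squeeze())).mean().abs().item()
        print(f"step {step}: residual at k=1: {low:.4f}, at k=8: {high:.4f}")
```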
In the present paper, we show that given a compact Kähler manifold $(X,\omega)$ with a Kähler metric $\omega$, and a complex submanifold $V\subset X$ of positive dimension, if $V$ has a holomorphic retraction structure in $X$, then any quasi-plurisubharmonic function $\varphi$ on $V$ such that $\omega|_V+\sqrt{-1}\partial\bar\partial\varphi\geq \varepsilon\omega|_V$ with $\varepsilon>0$ can be extended to a quasi-plurisubharmonic function $\Phi$ on $X$ such that $\omega+\sqrt{-1}\partial\bar\partial\Phi\geq \varepsilon'\omega$ for some $\varepsilon'>0$. This is an improvement of results in \cite{WZ20}. Examples satisfying the assumption that there exists a holomorphic retraction structure include product manifolds, and thus include many compact Kähler manifolds which are not necessarily projective.
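Displayed in standard notation (the primed constant $\varepsilon'$ is our labeling for the possibly different constant on $X$), the extension statement reads:

\[
  \omega|_V + \sqrt{-1}\,\partial\bar\partial\varphi \;\geq\; \varepsilon\,\omega|_V
  \quad\Longrightarrow\quad
  \exists\,\Phi:\ \Phi|_V=\varphi,\quad
  \omega + \sqrt{-1}\,\partial\bar\partial\Phi \;\geq\; \varepsilon'\,\omega \ \text{ on } X.
\]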
64 - Tao Luo, Zheng Ma, Zhiwei Wang 2020
A supervised learning problem is to find a function in a hypothesis function space given values on isolated data points. Inspired by the frequency principle in neural networks, we propose a Fourier-domain variational formulation for the supervised learning problem. This formulation circumvents the difficulty of imposing constraints at isolated data points in continuum modelling. Under a necessary and sufficient condition within our unified framework, we establish the well-posedness of the Fourier-domain variational problem by identifying a critical exponent that depends on the data dimension. In practice, a neural network is a convenient way to implement our formulation, and it automatically satisfies the well-posedness condition.
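As a concrete toy instance (our illustration; the paper's weight, critical exponent, and dimension setting differ), the one-dimensional problem $\min \int (1+\xi^2)\,|\hat f(\xi)|^2\,d\xi$ subject to $f(x_i)=y_i$ is solved by interpolation with the Laplace kernel $k(x)=e^{-|x|}$, whose Fourier transform is proportional to $1/(1+\xi^2)$ (a standard RKHS fact):

```python
# Minimal-norm interpolant of a Fourier-domain variational problem via the
# Laplace kernel; exact at the training points by construction.
import numpy as np

def laplace_interpolate(x_train, y_train, x_eval):
    k = lambda a, b: np.exp(-np.abs(a[:, None] - b[None, :]))
    K = k(x_train, x_train)                       # Gram matrix of the kernel
    coef = np.linalg.solve(K + 1e-10 * np.eye(len(x_train)), y_train)
    return k(x_eval, x_train) @ coef

x_tr = np.array([-2.0, -0.5, 0.0, 1.0, 2.5])
y_tr = np.sin(x_tr)
print(laplace_interpolate(x_tr, y_tr, x_tr) - y_tr)  # ~0: interpolation holds
```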
Discrete event sequences are ubiquitous, such as an ordered series of process interactions in Information and Communication Technology systems. Recent years have witnessed increasing efforts in detecting anomalies in discrete event sequences. However, it remains an extremely difficult task due to several intrinsic challenges, including data imbalance, the discrete nature of the events, and the sequential nature of the data. To address these challenges, in this paper we propose OC4Seq, a multi-scale one-class recurrent neural network for detecting anomalies in discrete event sequences. Specifically, OC4Seq integrates the anomaly detection objective with recurrent neural networks (RNNs) to embed the discrete event sequences into latent spaces, where anomalies can be easily detected. In addition, given that an anomalous sequence could be caused by individual events, subsequences of events, or the whole sequence, we design a multi-scale RNN framework to capture different levels of sequential patterns simultaneously. Experimental results on three benchmark datasets show that OC4Seq consistently outperforms various representative baselines by a large margin. Moreover, through both quantitative and qualitative analysis, we verify the importance of capturing multi-scale sequential patterns for event anomaly detection.
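A schematic sketch of the core idea (our reading of the abstract, not the released OC4Seq code; OC4Seq additionally applies the objective at multiple scales): embed event sequences with a GRU and train a Deep-SVDD-style one-class objective that pulls normal embeddings toward a center.

```python
# One-class RNN sketch: sequences far from the center get high anomaly scores.
import torch
import torch.nn as nn

class OneClassRNN(nn.Module):
    def __init__(self, n_events, emb_dim=32, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(n_events, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)
        self.register_buffer("center", torch.randn(hidden))  # fixed center c

    def score(self, seq):                  # seq: (batch, length) event ids
        _, h = self.rnn(self.emb(seq))     # h: (1, batch, hidden)
        return ((h[-1] - self.center) ** 2).sum(dim=1)  # anomaly score

model = OneClassRNN(n_events=100)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
normal_batch = torch.randint(0, 100, (16, 20))  # toy "normal" sequences
for _ in range(5):
    opt.zero_grad()
    loss = model.score(normal_batch).mean()  # shrink distances to the center
    loss.backward()
    opt.step()
# At test time, a large score(seq) flags the sequence as anomalous.
```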
121 - Bingrong Huang, Yongxiao Lin, 2020
In this note, we give a detailed proof of an asymptotic formula for averages of coefficients of a class of degree-three $L$-functions which can be factorized as a product of a degree-one and a degree-two $L$-function. We emphasize that we can break the $1/2$-barrier in the error term, and we obtain an explicit exponent.
Let $(X,\omega)$ be a compact Kähler manifold of complex dimension $n$ with a Kähler form $\omega$, and let $V\subset X$ be a compact complex submanifold of positive dimension $k<n$. Suppose that $V$ can be embedded in $X$ as the zero section of a holomorphic vector bundle of rank $n-k$ over $V$. Let $\varphi$ be a strictly $\omega|_V$-psh function on $V$. In this paper, we prove that there is a strictly $\omega$-psh function $\Phi$ on $X$ such that $\Phi|_V=\varphi$. This result gives a partial answer to an open problem raised by Collins-Tosatti and Dinew-Guedj-Zeriahi, for the case of Kähler currents. We also discuss possible extensions of Kähler currents in a big class.
Robust language processing systems are becoming increasingly important given the recent awareness of dangerous situations where brittle machine learning models can be easily broken in the presence of noise. In this paper, we introduce a robust word recognition framework that captures multi-level sequential dependencies in noised sentences. The proposed framework employs a sequence-to-sequence model over the characters of each word, whose output is fed to a word-level bi-directional recurrent neural network. We conduct extensive experiments to verify the effectiveness of the framework. The results show that the proposed framework outperforms state-of-the-art methods by a large margin, and they also suggest that character-level dependencies can play an important role in word recognition.
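The described hierarchy might look as follows in PyTorch (a structural sketch of ours; the paper uses a sequence-to-sequence model per word, which we simplify to a character-level encoder, and all dimensions and cell types are assumptions):

```python
# Character-level encoder per word, then a word-level BiLSTM over the sentence.
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    def __init__(self, n_chars, char_dim=32, word_dim=64, n_labels=1000):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.char_rnn = nn.GRU(char_dim, word_dim, batch_first=True)
        self.word_rnn = nn.LSTM(word_dim, word_dim, batch_first=True,
                                bidirectional=True)
        self.out = nn.Linear(2 * word_dim, n_labels)  # predict the clean word

    def forward(self, chars):
        # chars: (batch, n_words, word_len) character ids of a noised sentence
        b, w, l = chars.shape
        e = self.char_emb(chars.view(b * w, l))
        _, h = self.char_rnn(e)                 # one summary vector per word
        words = h[-1].view(b, w, -1)
        ctx, _ = self.word_rnn(words)           # sentence-level context
        return self.out(ctx)                    # (batch, n_words, n_labels)

model = CharWordEncoder(n_chars=80)
logits = model(torch.randint(0, 80, (2, 5, 12)))
print(logits.shape)  # torch.Size([2, 5, 1000])
```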
148 - Zhiwei Wang, Yao Ma, Zitao Liu 2019
Recurrent neural networks have long been the dominant choice for sequence modeling. However, they severely suffer from two issues: they are weak at capturing very long-term dependencies, and they cannot parallelize the sequential computation. Therefore, many non-recurrent sequence models built on convolution and attention operations have been proposed recently. Notably, models with multi-head attention such as the Transformer have demonstrated remarkable effectiveness in capturing long-term dependencies in a variety of sequence modeling tasks. Despite their success, however, these models lack the necessary components to model local structures in sequences and heavily rely on position embeddings, which have limited effect and require a considerable amount of design effort. In this paper, we propose the R-Transformer, which enjoys the advantages of both RNNs and the multi-head attention mechanism while avoiding their respective drawbacks. The proposed model can effectively capture both local structures and global long-term dependencies in sequences without using position embeddings. We evaluate the R-Transformer through extensive experiments on data from a wide range of domains, and the empirical results show that it outperforms state-of-the-art methods by a large margin in most tasks. We have made the code publicly available at https://github.com/DSE-MSU/R-transformer.
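A minimal sketch of the key idea as we read it from the abstract (window size, cell types, and dimensions are our assumptions, not the released code): a local RNN summarizes a short window ending at each position, and multi-head self-attention then mixes these order-aware summaries globally, so position embeddings are unnecessary.

```python
# LocalRNN + multi-head attention: local order via RNN, global mixing via attention.
import torch
import torch.nn as nn

class LocalRNNAttention(nn.Module):
    def __init__(self, dim=64, window=5, heads=4):
        super().__init__()
        self.window = window
        self.local_rnn = nn.GRU(dim, dim, batch_first=True)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                      # x: (batch, seq_len, dim)
        b, t, d = x.shape
        pad = x.new_zeros(b, self.window - 1, d)
        xp = torch.cat([pad, x], dim=1)
        # Slide a window ending at each position; the RNN encodes local order.
        windows = xp.unfold(1, self.window, 1).transpose(2, 3)  # (b, t, w, d)
        _, h = self.local_rnn(windows.reshape(b * t, self.window, d))
        local = h[-1].view(b, t, d)             # order-aware local summaries
        out, _ = self.attn(local, local, local) # global dependency mixing
        return out

layer = LocalRNNAttention()
y = layer(torch.randn(2, 10, 64))
print(y.shape)  # torch.Size([2, 10, 64])
```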
Let $X$ be a compact connected CR manifold of dimension $2n-1$ with a transversal CR $S^1$-action, which is only assumed to be weakly pseudoconvex. Let $\Box_b$ be the $\overline{\partial}_b$-Laplacian. Eigenvalue estimates for $\Box_b$ are a fundamental issue in both CR geometry and analysis. In this paper, we obtain a sharp estimate of the number of eigenvalues smaller than or equal to $\lambda$ of $\Box_b$ acting on the $m$-th Fourier components of smooth $(n-1,q)$-forms on $X$, where $m\in\mathbb{Z}_+$ and $q=0,1,\cdots,n-1$. Here sharp means that the growth order with respect to $m$ is sharp. In particular, when $\lambda=0$, we obtain an asymptotic estimate of the growth of the $m$-th Fourier components $H^{n-1,q}_{b,m}(X)$ of $H^{n-1,q}_b(X)$ as $m\rightarrow+\infty$. Furthermore, we establish a Serre-type duality theorem for Fourier components of Kohn-Rossi cohomology, which is of independent interest. As a byproduct, the asymptotic growth of the dimensions of the Fourier components $H^{0,q}_{b,-m}(X)$ for $m\in\mathbb{Z}_+$ is established. Compared with previous results in this field, the estimate for $\lambda=0$ already substantially improves the corresponding estimate of Hsiao and Li. We also give applications of our main results, including Morse-type inequalities, an asymptotic Riemann-Roch-type theorem, a Grauert-Riemenschneider-type criterion, and an orbifold version of our main results, which answers an open problem.
In this paper, we study questions of Demailly and Matsumura on the asymptotic behavior of the dimensions of cohomology groups for high tensor powers of (nef) pseudo-effective line bundles over manifolds that are not necessarily projective algebraic. By generalizing Siu's $\partial\overline{\partial}$-formula and Berndtsson's eigenvalue estimate for the $\overline{\partial}$-Laplacian, and combining Bonavero's technique, we obtain the following result: given a holomorphic pseudo-effective line bundle $(L,h_L)$ on a compact Hermitian manifold $(X,\omega)$, if $h_L$ is a singular metric with algebraic singularities, then $\dim H^{q}(X, L^k\otimes E\otimes\mathcal{I}(h_L^{k}))\leq Ck^{n-q}$ for $k$ large, where $E$ is an arbitrary holomorphic vector bundle. As applications, we obtain partial solutions to the questions of Demailly and Matsumura.
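For readability, the main estimate can be displayed as follows (here $n=\dim_{\mathbb{C}}X$ and $C$ is independent of $k$, our notation):

\[
  \dim H^{q}\big(X,\, L^{k}\otimes E\otimes \mathcal{I}(h_L^{k})\big)
  \;\leq\; C\,k^{\,n-q}, \qquad k \gg 1.
\]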