131 - Lun Luo, Si-Yuan Cao, Bin Han 2021
Recognizing places using Lidar in large-scale environments is challenging due to the sparse nature of point cloud data. In this paper we present BVMatch, a Lidar-based frame-to-frame place recognition framework that is capable of estimating 2D relative poses. Based on the assumption that the ground area can be approximated as a plane, we uniformly discretize the ground area into grids and project 3D Lidar scans to bird's-eye view (BV) images. We further use a bank of Log-Gabor filters to build a maximum index map (MIM) that encodes the orientation information of the structures in the images. We analyze the orientation characteristics of MIM theoretically and introduce a novel descriptor called bird's-eye view feature transform (BVFT). The proposed BVFT is insensitive to rotation and intensity variations of BV images. Leveraging the BVFT descriptors, we unify the Lidar place recognition and pose estimation tasks into the BVMatch framework. The experiments conducted on three large-scale datasets show that BVMatch outperforms the state-of-the-art methods in terms of both the recall rate of place recognition and pose estimation accuracy.
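As a rough illustration of the bird's-eye view projection step described above, the sketch below discretizes the ground plane into a uniform grid and bins each Lidar point into a cell, keeping the normalized point count as the pixel value. The grid resolution, image size, and use of point counts (rather than intensities) are illustrative assumptions, not the authors' implementation.

import numpy as np

def lidar_to_bv_image(points, grid_res=0.4, img_size=256):
    """Project an (N, 3) Lidar point cloud onto a bird's-eye-view (BV) image.

    Assumes the ground is roughly planar: each point is binned into a uniform
    2D grid over (x, y) and the cell value is the normalized point count.
    grid_res (metres per pixel) and img_size are illustrative choices.
    """
    half_extent = grid_res * img_size / 2.0
    x, y = points[:, 0], points[:, 1]
    mask = (np.abs(x) < half_extent) & (np.abs(y) < half_extent)   # crop to image footprint
    rows = ((y[mask] + half_extent) / grid_res).astype(int)
    cols = ((x[mask] + half_extent) / grid_res).astype(int)
    bv = np.zeros((img_size, img_size), dtype=np.float32)
    np.add.at(bv, (rows, cols), 1.0)            # accumulate point counts per grid cell
    if bv.max() > 0:
        bv /= bv.max()                          # normalize to [0, 1]
    return bv

# Usage: bv_img = lidar_to_bv_image(scan_xyz)   # scan_xyz is an (N, 3) numpy array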
254 - Difan Zou, Yuan Cao, Yuanzhi Li 2021
Adaptive gradient methods such as Adam have gained increasing popularity in deep learning optimization. However, it has been observed that compared with (stochastic) gradient descent, Adam can converge to a different solution with a significantly worse test error in many deep learning applications such as image classification, even with a fine-tuned regularization. In this paper, we provide a theoretical explanation for this phenomenon: we show that in the nonconvex setting of learning over-parameterized two-layer convolutional neural networks starting from the same random initialization, for a class of data distributions (inspired by image data), Adam and gradient descent (GD) can converge to different global solutions of the training objective with provably different generalization errors, even with weight decay regularization. In contrast, we show that if the training objective is convex and weight decay regularization is employed, any optimization algorithm, including Adam and GD, will converge to the same solution if the training is successful. This suggests that the inferior generalization performance of Adam is fundamentally tied to the nonconvex landscape of deep learning optimization.
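The contrast drawn above can be probed empirically with a minimal sketch like the one below: two copies of the same randomly initialized network are trained with Adam and with (full-batch) gradient descent, both using weight decay, and the resulting solutions are compared. The architecture, synthetic data, and hyperparameters are arbitrary assumptions, and the setup is far simpler than the convolutional networks analyzed in the paper.

import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 32)                      # toy data standing in for image inputs
y = (X[:, 0] > 0).long()                      # simple binary labels

def make_model():
    # Small two-layer network; the paper studies two-layer convolutional nets,
    # replaced here by a fully connected stand-in for brevity.
    return nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))

base = make_model()                           # one shared random initialization
models = {"adam": copy.deepcopy(base), "gd": copy.deepcopy(base)}
opts = {
    "adam": torch.optim.Adam(models["adam"].parameters(), lr=1e-3, weight_decay=1e-4),
    "gd":   torch.optim.SGD(models["gd"].parameters(), lr=1e-1, weight_decay=1e-4),
}

loss_fn = nn.CrossEntropyLoss()
for _ in range(2000):                         # full-batch training for simplicity
    for name, model in models.items():
        opts[name].zero_grad()
        loss_fn(model(X), y).backward()
        opts[name].step()

# The two solutions can now be compared, e.g. on held-out data.
print({name: loss_fn(m(X), y).item() for name, m in models.items()})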
Multispectral and multimodal image processing is important in the computer vision and computational photography communities. As the acquired multispectral and multimodal data are generally misaligned due to the alternation or movement of the imaging device, an image registration procedure is necessary. The registration of multispectral or multimodal images is challenging due to non-linear intensity and gradient variation. To cope with this challenge, we propose the phase congruency network (PCNet), which is able to enhance structure similarity and alleviate the non-linear intensity and gradient variation. The images can then be aligned using the similarity-enhanced features produced by the network. PCNet is constructed under the guidance of the phase congruency prior. The network contains three trainable layers accompanied by modified learnable Gabor kernels according to the phase congruency theory. Thanks to this prior knowledge, PCNet is extremely light-weight and can be trained on quite a small amount of multispectral data. PCNet can be viewed as fully convolutional and hence can take inputs of arbitrary sizes. Once trained, PCNet is applicable to a variety of multispectral and multimodal data such as RGB/NIR and flash/no-flash images without further tuning. Experimental results validate that PCNet outperforms current state-of-the-art registration algorithms, including deep-learning based ones with hundreds of times more parameters than PCNet. Thanks to the similarity enhancement training, PCNet outperforms the original phase congruency algorithm with two-thirds fewer feature channels.
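A minimal sketch of a convolution layer with learnable Gabor-style kernels is given below, only to illustrate the kind of parameterized filter bank the abstract alludes to; PCNet itself uses modified Log-Gabor kernels derived from the phase congruency prior, and the kernel parameterization and sizes here are assumptions.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableGaborConv(nn.Module):
    """Convolution with Gabor kernels whose frequency, orientation and width are learned.

    A rough stand-in for the 'modified learnable Gabor kernels' mentioned above,
    not PCNet's actual layers.
    """
    def __init__(self, n_filters=8, ksize=15):
        super().__init__()
        self.ksize = ksize
        self.freq = nn.Parameter(torch.rand(n_filters) * 0.3 + 0.1)      # cycles per pixel
        self.theta = nn.Parameter(torch.rand(n_filters) * math.pi)       # orientation
        self.sigma = nn.Parameter(torch.ones(n_filters) * ksize / 6.0)   # envelope width

    def forward(self, x):                      # x: (B, 1, H, W) grayscale input
        half = (self.ksize - 1) / 2.0
        grid = torch.arange(self.ksize, device=x.device, dtype=torch.float32) - half
        yy, xx = torch.meshgrid(grid, grid, indexing="ij")
        kernels = []
        for f, t, s in zip(self.freq, self.theta, self.sigma):
            xr = xx * torch.cos(t) + yy * torch.sin(t)                   # rotated coordinate
            env = torch.exp(-(xx ** 2 + yy ** 2) / (2 * s ** 2))         # Gaussian envelope
            kernels.append(env * torch.cos(2 * math.pi * f * xr))        # even-symmetric Gabor
        weight = torch.stack(kernels).unsqueeze(1)                       # (F, 1, k, k)
        return F.conv2d(x, weight, padding=self.ksize // 2)

# Usage: feats = LearnableGaborConv()(torch.randn(1, 1, 128, 128))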
132 - Jie Gui, Xiaofeng Cong, Yuan Cao 2021
The presence of haze significantly reduces the quality of images. Researchers have designed a variety of algorithms for image dehazing (ID) to restore the quality of hazy images. However, there are few studies that summarize deep learning (DL) based dehazing technologies. In this paper, we conduct a comprehensive survey on recently proposed dehazing methods. Firstly, we summarize the commonly used datasets, loss functions and evaluation metrics. Secondly, we group existing ID research into two major categories: supervised ID and unsupervised ID. The core ideas of various influential dehazing models are introduced. Finally, open issues for future research on ID are pointed out.
Let $\mathbb{F}_{q}$ be the finite field of $q$ elements and let $D_{2n}=\langle x,y\mid x^n=1,\ y^2=1,\ yxy=x^{n-1}\rangle$ be the dihedral group of order $2n$. Left ideals of the group algebra $\mathbb{F}_{q}[D_{2n}]$ are known as left dihedral codes over $\mathbb{F}_{q}$ of length $2n$, and abbreviated as left $D_{2n}$-codes. Let ${\rm gcd}(n,q)=1$. In this paper, we give an explicit representation for the Euclidean hull of every left $D_{2n}$-code over $\mathbb{F}_{q}$. On this basis, we determine all distinct Euclidean LCD codes and Euclidean self-orthogonal codes which are left $D_{2n}$-codes over $\mathbb{F}_{q}$. In particular, we provide an explicit representation and a precise enumeration for these two subclasses of left $D_{2n}$-codes and self-dual left $D_{2n}$-codes, respectively. Moreover, we give a direct and simple method for determining the encoder (generator matrix) of any left $D_{2n}$-code over $\mathbb{F}_{q}$, and present several numerical examples to illustrate our applications.
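For reference, the standard coding-theoretic definitions behind these terms (not quoted from the paper): for a linear code $C\subseteq\mathbb{F}_{q}^{2n}$ with Euclidean dual $C^{\perp}=\{v\mid \langle v,c\rangle=0 \text{ for all } c\in C\}$, the Euclidean hull is $\mathrm{Hull}(C)=C\cap C^{\perp}$; $C$ is Euclidean LCD when $\mathrm{Hull}(C)=\{0\}$, Euclidean self-orthogonal when $C\subseteq C^{\perp}$ (equivalently $\mathrm{Hull}(C)=C$), and self-dual when $C=C^{\perp}$.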
We propose a change-point detection method for large-scale multiple testing problems with data having clustered signals. Unlike the classic change-point setup, the signals can vary in size within a cluster. The clustering structure on the signals enables us to effectively delineate the boundaries between signal and non-signal segments. New test statistics are proposed for observations from one and/or multiple realizations. Their asymptotic distributions are derived. We also study the associated variance estimation problem. We allow the variances to be heteroscedastic in the multiple-realization case, which substantially expands the applicability of the proposed method. Simulation studies demonstrate that the proposed approach has a favorable performance. Our procedure is applied to an array-based Comparative Genomic Hybridization (aCGH) dataset.
We consider a general tractable model for default contagion and systemic risk in a heterogeneous financial network, subject to an exogenous macroeconomic shock. We show that, under some regularity assumptions, the default cascade model can be transferred to a death process problem represented by a balls-and-bins model. We also reduce the dimension of the problem by classifying banks according to different types, in an appropriate type space. These types may be calibrated to real-world data by using machine learning techniques. We then state various limit theorems regarding the final size of the default cascade over different types. In particular, under suitable assumptions on the degree and threshold distributions, we show that the final size of the default cascade has asymptotically Gaussian fluctuations. We next state limit theorems for different system-wide wealth aggregation functions and show how the systemic risk measure, in a given stress test scenario, can be related to the structure and heterogeneity of financial networks. We finally show how these results can be used by a social planner to optimally target interventions during a financial crisis, with a budget constraint and under partial information of the financial network.
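The sketch below simulates a generic threshold default cascade on a random exposure network, purely to illustrate the kind of process whose final size such limit theorems describe; the network model, thresholds, and shock size are arbitrary assumptions, and this is not the paper's balls-and-bins construction.

import numpy as np

rng = np.random.default_rng(0)
n = 1000                                          # number of banks (arbitrary)
adj = (rng.random((n, n)) < 5.0 / n).astype(int)  # who is exposed to whom
np.fill_diagonal(adj, 0)
threshold = rng.integers(1, 4, size=n)            # counterparty defaults a bank can withstand
defaulted = rng.random(n) < 0.02                  # exogenous macroeconomic shock

# Iterate the cascade: a bank defaults once the number of its defaulted
# counterparties reaches its threshold.
while True:
    exposure = adj @ defaulted.astype(int)        # defaulted counterparties per bank
    newly = (~defaulted) & (exposure >= threshold)
    if not newly.any():
        break
    defaulted |= newly

print("final cascade size:", int(defaulted.sum()), "out of", n)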
Large-scale multiple testing is a fundamental problem in high-dimensional statistical inference. It is increasingly common that various types of auxiliary information, reflecting the structural relationship among the hypotheses, are available. Exploiting such auxiliary information can boost statistical power. To this end, we propose a framework based on a two-group mixture model with varying probabilities of being null for different hypotheses a priori, where a shape-constrained relationship is imposed between the auxiliary information and the prior probabilities of being null. An optimal rejection rule is designed to maximize the expected number of true positives when the average false discovery rate is controlled. Focusing on the ordered structure, we develop a robust EM algorithm to estimate the prior probabilities of being null and the distribution of $p$-values under the alternative hypothesis simultaneously. We show that the proposed method has better power than state-of-the-art competitors while controlling the false discovery rate, both empirically and theoretically. Extensive simulations demonstrate the advantage of the proposed method. Datasets from genome-wide association studies are used to illustrate the new methodology.
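Below is a generic sketch of a local-false-discovery-rate rejection rule of the kind a two-group mixture with hypothesis-specific null probabilities supports. The prior null probabilities and the alternative p-value density are taken as given (the paper estimates them jointly with a shape-constrained EM algorithm), so this is an illustration rather than the paper's exact optimal rule.

import numpy as np

def lfdr_reject(pvals, pi0, f1_vals, alpha=0.05):
    """Reject hypotheses using local false discovery rates.

    pi0     : prior probability of being null for each hypothesis (assumed given here).
    f1_vals : density of the p-value under the alternative, evaluated at pvals.
    Under the null the p-value density is 1 (uniform), so
        lfdr_i = pi0_i / (pi0_i + (1 - pi0_i) * f1_i).
    Hypotheses are rejected in increasing lfdr order while the running mean lfdr
    (an estimate of the FDR among the rejections) stays below alpha.
    """
    lfdr = pi0 / (pi0 + (1.0 - pi0) * f1_vals)
    order = np.argsort(lfdr)
    running_mean = np.cumsum(lfdr[order]) / np.arange(1, len(lfdr) + 1)
    ok = running_mean <= alpha
    k = np.max(np.nonzero(ok)[0]) + 1 if np.any(ok) else 0
    reject = np.zeros(len(pvals), dtype=bool)
    reject[order[:k]] = True
    return reject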
Moiré quantum matter has emerged as a novel materials platform where correlated and topological phases can be explored with unprecedented control. Among them, magic-angle systems constructed from two or three layers of graphene have shown robust superconducting phases with unconventional characteristics. However, direct evidence for unconventional pairing remains to be experimentally demonstrated. Here, we show that magic-angle twisted trilayer graphene (MATTG) exhibits superconductivity up to in-plane magnetic fields in excess of 10 T, which represents a large ($2\sim 3$ times) violation of the Pauli limit for conventional spin-singlet superconductors. This observation is surprising for a system which is not expected to have strong spin-orbit coupling. Furthermore, the Pauli limit violation is observed over the entire superconducting phase, indicating that it is not related to a possible pseudogap phase with large superconducting amplitude pairing. More strikingly, we observe reentrant superconductivity at large magnetic fields, which is present in a narrower range of carrier density and displacement field. These findings suggest that the superconductivity in MATTG is likely driven by a mechanism resulting in non-spin-singlet Cooper pairs, where the external magnetic field can cause transitions between phases with potentially different order parameters. Our results showcase the richness of moiré superconductivity and may pave a new route towards designing next-generation exotic quantum matter.
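For context, the Pauli (Clogston-Chandrasekhar) limit of a conventional spin-singlet superconductor is given by the standard estimate $B_P \approx 1.86\,\mathrm{T/K}\times T_c$. Assuming a critical temperature of roughly $T_c\approx 2\,\mathrm{K}$ (a typical MATTG value, not stated in the abstract), this gives $B_P\approx 3.7\,\mathrm{T}$, so an in-plane critical field above $10\,\mathrm{T}$ corresponds to the quoted $2\sim 3$ times violation ($10/3.7\approx 2.7$).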
Dialogue state tracking (DST) is a pivotal component in task-oriented dialogue systems. While it is relatively easy for a DST model to capture belief states in short conversations, the task of DST becomes more challenging as the length of a dialogue increases due to the injection of more distracting contexts. In this paper, we aim to improve the overall performance of DST with a special focus on handling longer dialogues. We tackle this problem from three perspectives: 1) a model designed to enable hierarchical slot status prediction; 2) a balanced training procedure for generic and task-specific language understanding; 3) data perturbation, which enhances the model's ability to handle longer conversations. We conduct experiments on the MultiWOZ benchmark, and demonstrate the effectiveness of each component via a set of ablation tests, especially on longer conversations.