Yun-Bin Zhao, Zhi-Quan Luo (2021)
Orthogonal matching pursuit (OMP) is one of the mainstream algorithms for signal reconstruction and approximation. It plays a vital role in the development of compressed sensing theory, and it also acts as a driving force for the development of other heuristic methods for signal reconstruction. In this paper, we propose the so-called dynamic orthogonal matching pursuit (DOMP) and its two enhanced versions.
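The greedy loop behind classic OMP can be sketched in a few lines of NumPy. This is the textbook algorithm only, not the paper's DOMP variant, which changes how indices are selected:

```python
import numpy as np

def omp(A, y, k):
    """Classic orthogonal matching pursuit: greedily pick the column of A
    most correlated with the residual, then re-fit by least squares on the
    selected support. Returns a k-sparse estimate x with y ~= A @ x."""
    m, n = A.shape
    support = []
    residual = y.copy()
    x = np.zeros(n)
    for _ in range(k):
        # index of the column most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares solve restricted to the chosen support
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x = np.zeros(n)
        x[support] = coef
        residual = y - A @ x
    return x
```

The least-squares re-fit over the whole support at every pass is what distinguishes OMP from plain matching pursuit, and it is why the residual stays orthogonal to all previously chosen columns.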
Trapped Be+ ions are a leading platform for quantum information science [1], but reactions with background gas species, such as H2 and H2O, result in qubit loss. Our experiment reveals that the BeOH+ ion is the final trapped ion species when both H2 and H2O exist in a vacuum system with cold, trapped Be+. To understand the loss mechanism, low-temperature reactions between sympathetically cooled BeD+ ions and H2O molecules have been investigated using an integrated, laser-cooled Be+ ion trap and high-resolution time-of-flight (TOF) mass spectrometer (MS) [2]. Among all the possible products, BeH2O+, H2DO+, BeOD+, and BeOH+, only the BeOH+ molecular ion was observed experimentally, with the assumed co-product of HD. Theoretical analyses based on the explicitly correlated restricted coupled cluster singles, doubles, and perturbative triples (RCCSD(T)-F12) method with the augmented correlation-consistent polarized triple-zeta (AVTZ) basis set reveal that the two intuitive direct abstraction product channels, Be + H2DO+ and D + BeH2O+, are not energetically accessible at the present reaction temperature (~150 K). Instead, a double-displacement BeOH+ + HD product channel is accessible, owing to a large exothermicity of 1.885 eV, through a submerged barrier in the reaction pathway. While the BeOD+ + H2 product channel has a similar exothermicity, its reaction pathway is dynamically unfavourable, as suggested by a Sudden Vector Projection analysis. This work sheds light on the origin of the loss and contamination of laser-cooled Be+ ions in quantum-information experiments.
In this paper we introduce a light Dirac particle $\psi$ as a thermal dark matter candidate in a $U(1)_{L_\mu - L_\tau}$ model. Together with the new gauge boson $X$, we find a possible parameter space with $m_X \simeq 20$ MeV, $U(1)_{L_\mu - L_\tau}$ coupling $g_X \simeq 5 \cdot 10^{-4}$, and $m_\psi \gtrsim m_X/2$, where the $(g-2)_\mu$ anomaly, dark matter, the Hubble tension, and (part of) the excess of $511$ keV photons from the region near the galactic center can be explained simultaneously. This model is safe from current experimental and astrophysical constraints, but can be probed by the next generation of neutrino experiments as well as low-energy $e^+e^-$ colliders.
Optimization problems with a sparsity constraint form an important class of global optimization problems. A typical thresholding algorithm for solving such a problem adopts the traditional full steepest descent direction or a Newton-like direction as the search direction to generate an iterate, on which a certain thresholding is performed. Traditional hard thresholding discards a large part of a vector when the vector is dense, so much of the important information contained in a dense vector is lost in the thresholding process. A recent study [Zhao, SIAM J Optim, 30(1), pp. 31-55, 2020] shows that hard thresholding should be applied to a compressible vector instead of a dense one to avoid a large loss of information. On the other hand, the optimal $k$-thresholding, as a novel thresholding technique, may overcome the intrinsic drawback of hard thresholding, since it performs thresholding and objective-function minimization simultaneously. This motivates us to propose the so-called partial gradient optimal thresholding method in this paper, which integrates the partial gradient and the optimal $k$-thresholding technique. The solution error bound and convergence of the proposed algorithms are established under suitable conditions. Application of our results to sparse optimization problems arising from signal recovery is also discussed. Experimental results on synthetic data indicate that the proposed algorithm, called PGROTP, is efficient and comparable to several existing algorithms.
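The information loss described above is easy to see with the standard hard-thresholding operator $H_k$. The following is a generic sketch of that operator, not the paper's PGROTP method:

```python
import numpy as np

def hard_threshold(v, k):
    """H_k(v): keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-k:]   # indices of the k largest magnitudes
    out[keep] = v[keep]
    return out

def energy_retained(v, k):
    """Fraction of ||v||_2^2 that survives hard thresholding."""
    return np.linalg.norm(hard_threshold(v, k)) ** 2 / np.linalg.norm(v) ** 2

# A dense (flat) vector loses most of its energy; a compressible one loses little.
dense = np.ones(100)                  # perfectly flat magnitudes
compressible = 0.5 ** np.arange(100)  # rapidly decaying magnitudes
```

For $k = 5$, the flat vector keeps only 5% of its energy, while the compressible vector keeps essentially all of it. This is the intuition behind applying $H_k$ to compressible iterates rather than dense ones.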
Hanbin Zhao, Xin Qin, Shihao Su (2021)
With the rapid development of social media, a tremendous number of videos with new classes are generated daily, raising an urgent demand for video classification methods that can continuously incorporate new classes while maintaining the knowledge of old videos with limited storage and computing resources. In this paper, we summarize this task as Class-Incremental Video Classification (CIVC) and propose a novel framework to address it. As a subarea of incremental learning tasks, the challenge of catastrophic forgetting is unavoidable in CIVC. To better alleviate it, we utilize some characteristics of videos. First, we decompose the spatio-temporal knowledge before distillation rather than treating it as a whole in the knowledge-transfer process; trajectory information is also used to refine the decomposition. Second, we propose a dual-granularity exemplar selection method to select and store representative video instances of old classes and key frames inside videos under a tight storage budget. We benchmark our method and previous state-of-the-art class-incremental learning methods on the Something-Something V2 and Kinetics datasets, and our method outperforms previous methods significantly.
Hui Wang, Hanbin Zhao (2021)
In this paper, we propose a novel image processing scheme called class-based expansion learning for image classification, which aims at improving the supervision-stimulation frequency for samples of the confusing classes. Class-based expansion learning takes a bottom-up growing strategy in a class-based expansion optimization fashion, paying more attention to the quality of learning fine-grained classification boundaries for the preferentially selected classes. Besides, we develop a class confusion criterion to preferentially select confusing classes for training. In this way, the classification boundaries of the confusing classes are frequently stimulated, resulting in fine-grained boundaries. Experimental results demonstrate the effectiveness of the proposed scheme on several benchmarks.
Bin Zhao, Xuelong Li (2021)
Video frame interpolation can up-convert the frame rate and enhance the video quality. In recent years, although interpolation performance has improved greatly, image blur usually occurs at object boundaries owing to large motion. This has been a long-standing problem that has not yet been addressed. In this paper, we propose to reduce the image blur and obtain clear object shapes by preserving the edges in the interpolated frames. To this end, the proposed Edge-Aware Network (EA-Net) integrates edge information into the frame interpolation task. It follows an end-to-end architecture and can be separated into two stages, i.e., edge-guided flow estimation and edge-protected frame synthesis. Specifically, in the flow estimation stage, three edge-aware mechanisms are developed to emphasize the frame edges in estimating flow maps, so that the edge maps serve as auxiliary information to provide more guidance and boost the flow accuracy. In the frame synthesis stage, a flow refinement module is designed to refine the flow map, and an attention module is employed to adaptively focus on the bidirectional flow maps when synthesizing the intermediate frames. Furthermore, frame and edge discriminators are adopted for adversarial training, so as to enhance the realism and clarity of the synthesized frames. Experiments on three benchmarks, including Vimeo90k and UCF101 for single-frame interpolation and Adobe240-fps for multi-frame interpolation, demonstrate the superiority of the proposed EA-Net for the video frame interpolation task.
Audio and vision are the two main modalities in video data. Multimodal learning, especially audiovisual learning, has drawn considerable attention recently and can boost the performance of various computer vision tasks. However, in video summarization, existing approaches just exploit the visual information while neglecting the audio information. In this paper, we argue that the audio modality can assist the vision modality in better understanding the video content and structure, and further benefit the summarization process. Motivated by this, we propose to jointly exploit the audio and visual information for the video summarization task, and develop an AudioVisual Recurrent Network (AVRN) to achieve this. Specifically, the proposed AVRN can be separated into three parts: 1) a two-stream LSTM encodes the audio and visual features sequentially by capturing their temporal dependency; 2) an audiovisual fusion LSTM fuses the two modalities by exploring the latent consistency between them; 3) a self-attention video encoder captures the global dependency in the video. Finally, the fused audiovisual information and the integrated temporal and global dependencies are jointly used to predict the video summary. Experimental results on two benchmarks, i.e., SumMe and TVsum, have demonstrated the effectiveness of each part and the superiority of AVRN over approaches exploiting only visual information for video summarization.
Exploiting the inner-shot and inter-shot dependencies is essential for key-shot based video summarization. Current approaches mainly model the video as a frame sequence with recurrent neural networks. However, one potential limitation of these sequence models is that they focus on capturing local neighborhood dependencies, while high-order dependencies over long distances are not fully exploited. In general, the frames in each shot record a certain activity and vary smoothly over time, but multi-hop relationships occur frequently among shots. In this case, both the local and global dependencies are important for understanding the video content. Motivated by this, we propose a Reconstructive Sequence-Graph Network (RSGN) to encode the frames and shots hierarchically as a sequence and a graph, where the frame-level dependencies are encoded by a Long Short-Term Memory (LSTM) network and the shot-level dependencies are captured by a Graph Convolutional Network (GCN). The videos are then summarized by exploiting both the local and global dependencies among shots. Besides, a reconstructor is developed to reward the summary generator, so that the generator can be optimized in an unsupervised manner, which alleviates the lack of annotated data in video summarization. Furthermore, under the guidance of the reconstruction loss, the predicted summary can better preserve the main video content and shot-level dependencies. Experimental results on three popular datasets (i.e., SumMe, TVsum, and VTW) have demonstrated the superiority of our proposed approach for the summarization task.
Nan Meng, Yun-Bin Zhao (2021)
Sparse signals can be reconstructed by an algorithm that merges a traditional nonlinear optimization method with a certain thresholding technique. Different from existing thresholding methods, a novel thresholding technique referred to as the optimal $k$-thresholding was recently proposed by Zhao [SIAM J Optim, 30(1), pp. 31-55, 2020]. This technique simultaneously performs the minimization of an error metric for the problem and the thresholding of the iterates generated by the classic gradient method. In this paper, we propose the so-called Newton-type optimal $k$-thresholding (NTOT) algorithm, which is motivated by the appreciable performance of both Newton-type methods and the optimal $k$-thresholding technique for signal recovery. The guaranteed performance (including convergence) of the proposed algorithms is established in terms of suitable choices of the algorithmic parameters and the restricted isometry property (RIP) of the sensing matrix, which has been widely used in the analysis of compressive sensing algorithms. Simulation results based on synthetic signals indicate that the proposed algorithms are stable and efficient for signal recovery.
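For contrast with the optimal $k$-thresholding approach, the classic gradient-plus-hard-thresholding iteration (iterative hard thresholding, IHT) that such methods improve on can be sketched as follows. The step-size rule below is one common conservative choice, not the one analyzed in the paper:

```python
import numpy as np

def iht(A, y, k, steps=100, mu=None):
    """Baseline iterative hard thresholding: a gradient step on
    0.5 * ||y - A x||^2 followed by H_k. Shown only as the classic
    scheme; this is NOT the paper's NTOT algorithm."""
    m, n = A.shape
    if mu is None:
        # conservative step size from the spectral norm of A
        mu = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(n)
    for _ in range(steps):
        v = x + mu * (A.T @ (y - A @ x))   # gradient step on the residual
        keep = np.argsort(np.abs(v))[-k:]  # hard thresholding H_k
        x = np.zeros(n)
        x[keep] = v[keep]
    return x
```

The hard-thresholding step here acts on whatever vector the gradient step produces, dense or not; the optimal $k$-thresholding line of work replaces this step with a data-driven selection that is coupled to the objective minimization.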