Research papers, master's theses, and doctoral dissertations published by Jie Wu

Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction

87 - Mahsa Yarmohammadi, Shijie Wu, Marc Marone 2021

Zero-shot cross-lingual information extraction (IE) describes the construction of an IE model for some target language, given existing annotations exclusively in some other language, typically English. While the advance of pretrained multilingual encoders suggests an easy optimism of "train on English, run on any language", we find through a thorough exploration and extension of techniques that a combination of approaches, both new and old, leads to better performance than any one cross-lingual strategy in particular. We explore techniques including data projection and self-training, and how different pretrained encoders impact them. We use English-to-Arabic IE as our initial example, demonstrating strong performance in this setting for event extraction, named entity recognition, part-of-speech tagging, and dependency parsing. We then apply data projection and self-training to three tasks across eight target languages. Because no single set of techniques performs the best across all tasks, we encourage practitioners to explore various configurations of the techniques described in this work when seeking to improve on zero-shot training.
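Self-training, one of the techniques named in this abstract, can be illustrated on toy data. The sketch below is only a generic self-training loop, not the authors' pipeline: the nearest-centroid "model", the 1-D features, the `min_margin` threshold, and all function names are assumptions made for illustration. Labeled examples stand in for source-language annotations and the unlabeled pool stands in for target-language text.

```python
from statistics import mean


def fit_centroids(examples):
    """Toy 'model': one centroid per label over 1-D features."""
    by_label = {}
    for x, y in examples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def predict(centroids, x):
    """Return (label, margin), where margin is the distance gap
    between the nearest and the second-nearest centroid."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, float("inf")
    margin = abs(x - centroids[ranked[1]]) - abs(x - centroids[best])
    return best, margin


def self_train(labeled, unlabeled, rounds=3, min_margin=1.0):
    """Repeatedly pseudo-label confident unlabeled points and retrain."""
    train = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        model = fit_centroids(train)
        confident, rest = [], []
        for x in pool:
            label, margin = predict(model, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                rest.append(x)
        if not confident:
            break  # nothing new to learn from
        train.extend(confident)
        pool = rest
    return fit_centroids(train), train
```

Points that the current model labels with a wide margin are added to the training set; ambiguous points stay in the pool, which is the usual guard against reinforcing early mistakes.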

Computation and Language

On some sums involving the integral part function

108 - Kui Liu, Jie Wu, Zhishan Yang 2021

Denote by $\tau_k(n)$, $\omega(n)$ and $\mu_2(n)$ the number of representations of $n$ as a product of $k$ natural numbers, the number of distinct prime factors of $n$, and the characteristic function of the square-free integers, respectively. Let $[t]$ be the integral part of the real number $t$. For $f = \omega, 2^{\omega}, \mu_2, \tau_k$, we prove that $$\sum_{n \le x} f\left(\left[\frac{x}{n}\right]\right) = x \sum_{d \ge 1} \frac{f(d)}{d(d+1)} + O_{\epsilon}(x^{\theta_f + \epsilon})$$ for $x \rightarrow \infty$, where $\theta_{\omega} = \frac{53}{110}$, $\theta_{2^{\omega}} = \frac{9}{19}$, $\theta_{\mu_2} = \frac{2}{5}$, $\theta_{\tau_k} = \frac{5k-1}{10k-1}$ and $\epsilon > 0$ is an arbitrarily small positive number. These improve the corresponding results of Bordellès.
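The shape of this asymptotic formula can be checked numerically for the square-free indicator. The sketch below is an illustration only: it compares the sum of f([x/n]) over n up to x with x times a truncated version of the series of f(d)/(d(d+1)). The function names and the truncation point `cutoff` are assumptions; the truncation drops a tail of size at most x/cutoff.

```python
def squarefree_indicator(limit):
    """Return a list mu2 with mu2[n] = 1 if n is square-free, else 0.
    Index 0 is unused and set to 0."""
    mu2 = [1] * (limit + 1)
    mu2[0] = 0
    p = 2
    while p * p <= limit:
        # Clear every multiple of p*p (composite p only repeats earlier work).
        for m in range(p * p, limit + 1, p * p):
            mu2[m] = 0
        p += 1
    return mu2


def floor_sum(f, x):
    """Left-hand side: sum of f([x/n]) for n = 1..x, with integer x."""
    return sum(f[x // n] for n in range(1, x + 1))


def main_term(f, x, cutoff):
    """Right-hand side main term, with the infinite series cut at d = cutoff."""
    return x * sum(f[d] / (d * (d + 1)) for d in range(1, cutoff + 1))


if __name__ == "__main__":
    x = 10_000
    mu2 = squarefree_indicator(10 * x)
    print(floor_sum(mu2, x), main_term(mu2, x, 10 * x))
```

For moderate x the two printed values should be close, with a gap that grows much more slowly than x itself, which is what the error term in the theorem expresses.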

Number Theory

PAENet: A Progressive Attention-Enhanced Network for 3D to 2D Retinal Vessel Segmentation

120 - Zhuojie Wu, Muyi Sun 2021

3D to 2D retinal vessel segmentation is a challenging problem in Optical Coherence Tomography Angiography (OCTA) images. Accurate retinal vessel segmentation is important for the diagnosis and prevention of ophthalmic diseases, and making full use of the 3D data of OCTA volumes is a vital factor for obtaining satisfactory segmentation results. In this paper, we propose a Progressive Attention-Enhanced Network (PAENet) based on attention mechanisms to extract rich feature representations. Specifically, the framework consists of two main parts, the three-dimensional feature learning path and the two-dimensional segmentation path. In the three-dimensional feature learning path, we design a novel Adaptive Pooling Module (APM) and propose a new Quadruple Attention Module (QAM). The APM captures dependencies along the projection direction of volumes and learns a series of pooling coefficients for feature fusion, which efficiently reduces the feature dimension. In addition, the QAM reweights the features by capturing four-group cross-dimension dependencies, which makes maximum use of 4D feature tensors. In the two-dimensional segmentation path, to acquire more detailed information, we propose a Feature Fusion Module (FFM) to inject 3D information into the 2D path. Meanwhile, we adopt the Polarized Self-Attention (PSA) block to model the semantic interdependencies in the spatial and channel dimensions respectively. Extensive experiments on the OCTA-500 dataset show that our proposed algorithm achieves state-of-the-art performance compared with previous methods.
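The idea of pooling a volume along its projection direction with learned coefficients can be sketched as a weighted sum over depth slices. This is a rough illustration and not the APM as implemented by the authors: the softmax normalization, the fixed `logits` (which a real network would learn), and the function names are all assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_depth_pool(volume, logits):
    """Collapse volume[d][h][w] to a 2-D map[h][w] using one
    pooling coefficient per depth slice."""
    coeffs = softmax(logits)
    height, width = len(volume[0]), len(volume[0][0])
    return [
        [
            sum(c * volume[d][i][j] for d, c in enumerate(coeffs))
            for j in range(width)
        ]
        for i in range(height)
    ]
```

With equal logits this reduces to average pooling along depth; a very large logit on one slice approaches selecting that slice alone, so a learned set of coefficients can interpolate between the two.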

Image and Video Processing, Computer Vision and Pattern Recognition

Online Multi-Granularity Distillation for GAN Compression

145 - Yuxi Ren, Jie Wu, Xuefeng Xiao 2021

Generative Adversarial Networks (GANs) have achieved widespread success in generating outstanding images; however, they are burdensome to deploy on resource-constrained devices due to heavy computational costs and large memory usage. Although recent efforts on compressing GANs have achieved remarkable results, the compressed models still contain potential redundancies and can be compressed further. To solve this issue, we propose a novel online multi-granularity distillation (OMGD) scheme to obtain lightweight GANs, which contributes to generating high-fidelity images with low computational demands. We offer the first attempt to popularize single-stage online distillation for GAN-oriented compression, where the progressively promoted teacher generator helps to refine the discriminator-free student generator. Complementary teacher generators and network layers provide comprehensive and multi-granularity concepts to enhance visual fidelity from diverse dimensions. Experimental results on four benchmark datasets demonstrate that OMGD succeeds in compressing MACs by 40x and parameters by 82.5x on Pix2Pix and CycleGAN, without loss of image quality. This shows that OMGD provides a feasible solution for the deployment of real-time image translation on resource-constrained devices. Our code and models are made public at: https://github.com/bytedance/OMGD.

Computer Vision and Pattern Recognition

Classification of solutions of the 2D steady Navier-Stokes equations with separated variables in cone-like domains

133 - Wendong Wang, Jie Wu 2021

We investigate the problem of classification of solutions for the steady Navier-Stokes equations in any cone-like domain. In the form of separated variables, $$u(x,y)=\left( \begin{array}{c} \varphi_1(r)v_1(\theta) \\ \varphi_2(r)v_2(\theta) \end{array} \right),$$ where $x=r\cos\theta$ and $y=r\sin\theta$ in polar coordinates, we obtain the expressions of all smooth solutions with $C^0$ Dirichlet boundary condition. In particular, it shows that (i) some solutions are found, which are Hölder continuous on the boundary, but their gradients blow up at the corner; (ii) all solutions in the entire plane $\mathbb{R}^2$, like harmonic functions or solutions of the Stokes equations, are polynomial expressions.

Analysis of PDEs

Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video

162 - Jie Wu, Wei Zhang, Guanbin Li 2021

In this paper, we introduce a novel task, referred to as Weakly-Supervised Spatio-Temporal Anomaly Detection (WSSTAD) in surveillance video. Specifically, given an untrimmed video, WSSTAD aims to localize a spatio-temporal tube (i.e., a sequence of bounding boxes at consecutive times) that encloses the abnormal event, with only coarse video-level annotations as supervision during training. To address this challenging task, we propose a dual-branch network which takes as input proposals at multiple granularities in both the spatial and temporal domains. Each branch employs a relationship reasoning module to capture the correlation between tubes/videolets, which can provide rich contextual information and complex entity relationships for the concept learning of abnormal behaviors. A Mutually-guided Progressive Refinement framework is set up to employ dual-path mutual guidance in a recurrent manner, iteratively sharing auxiliary supervision information across branches. It impels the learned concepts of each branch to serve as a guide for its counterpart, which progressively refines the corresponding branch and the whole framework. Furthermore, we contribute two datasets, i.e., ST-UCF-Crime and STRA, consisting of videos containing spatio-temporal abnormal annotations to serve as the benchmarks for WSSTAD. We conduct extensive qualitative and quantitative evaluations to demonstrate the effectiveness of the proposed approach and analyze the key factors that contribute most to handling this task.
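A spatio-temporal tube, described in this abstract as a sequence of bounding boxes at consecutive times, maps naturally onto a small data structure. The sketch below is illustrative only: the `Tube` class, the `(x1, y1, x2, y2)` box convention, and the overlap score (mean per-frame IoU over the union of frames) are assumptions, not the evaluation protocol of the paper.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Tube:
    """A tube maps each frame index to one bounding box."""
    boxes: Dict[int, Box]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def tube_iou(p: Tube, q: Tube) -> float:
    """Mean per-frame box IoU over all frames covered by either tube.
    Frames present in only one tube contribute zero overlap."""
    frames = set(p.boxes) | set(q.boxes)
    if not frames:
        return 0.0
    shared = set(p.boxes) & set(q.boxes)
    return sum(box_iou(p.boxes[t], q.boxes[t]) for t in shared) / len(frames)
```

Counting frames that appear in only one tube as zero overlap penalizes predictions that are spatially right but temporally too long or too short.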

Computer Vision and Pattern Recognition

Variational quantum process tomography

85 - Shichuan Xue, Yong Liu, Yang Wang 2021

Quantum process tomography is an experimental technique to fully characterize an unknown quantum process. Standard quantum process tomography suffers from exponential scaling of the number of measurements with increasing system size. In this work, we put forward a quantum machine learning algorithm which approximately encodes the unknown unitary quantum process into a relatively shallow-depth parametric quantum circuit. We demonstrate our method by reconstructing the unitary quantum processes resulting from quantum Hamiltonian evolution and random quantum circuits up to $8$ qubits. Results show that those quantum processes can be reconstructed with high fidelity, while the number of input states required is at least $2$ orders of magnitude smaller than that required by standard quantum process tomography.
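The exponential scaling mentioned in this abstract can be made concrete with a small calculation. The sketch assumes the textbook form of standard process tomography, which probes a process on a d-dimensional system with d^2 linearly independent input states, where d = 2^n for n qubits. The abstract does not state the exact baseline the authors count against, so the numbers are indicative only, and the function names are hypothetical.

```python
def standard_qpt_input_states(num_qubits: int) -> int:
    """Input states for textbook process tomography: d**2 with d = 2**n."""
    dimension = 2 ** num_qubits
    return dimension ** 2


def budget_two_orders_lower(num_qubits: int) -> int:
    """Largest input-state count that is still 100x below the baseline."""
    return standard_qpt_input_states(num_qubits) // 100


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, standard_qpt_input_states(n), budget_two_orders_lower(n))
```

Under this assumption, 8 qubits correspond to 4^8 = 65536 input states for the standard scheme, so a reduction of two orders of magnitude means working with a few hundred input states at most.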

Quantum Physics

FloorPP-Net: Reconstructing Floor Plans using Point Pillars for Scan-to-BIM

37 - Yijie Wu, Fan Xue 2021

This paper presents a deep learning-based point cloud processing method named FloorPP-Net for the task of Scan-to-BIM (building information model). FloorPP-Net first converts the input point cloud of a building story into point pillars (PP), then predicts the corners and edges to output the floor plan. Altogether, FloorPP-Net establishes an end-to-end supervised learning framework for the Scan-to-Floor-Plan (Scan2FP) task. In the 1st International Scan-to-BIM Challenge held in conjunction with CVPR 2021, FloorPP-Net was ranked the second runner-up in the floor plan reconstruction track. Future work includes general edge proposals, 2D plan regularization, and 3D BIM reconstruction.
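The first step described in this abstract, converting a point cloud into point pillars, amounts to binning points into vertical columns on a horizontal grid. The sketch below shows only that binning step under assumed conventions: the `cell_size` parameter, the function names, and the hand-crafted per-pillar features (point count and mean height) are illustrative, whereas pillar-based networks typically learn their pillar features.

```python
import math
from collections import defaultdict


def pillarize(points, cell_size):
    """Group (x, y, z) points into vertical pillars on an x-y grid.
    Returns a dict mapping grid cell (ix, iy) to its list of points."""
    pillars = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        pillars[key].append((x, y, z))
    return dict(pillars)


def pillar_features(pillars):
    """Simple per-pillar summary: (number of points, mean height)."""
    return {
        key: (len(pts), sum(p[2] for p in pts) / len(pts))
        for key, pts in pillars.items()
    }
```

Because every pillar spans the full height of the story, the result is a 2-D grid of features, which is a convenient input for predicting corners and edges in plan view.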

Computer Vision and Pattern Recognition

Literature review on vulnerability detection using NLP technology

59 - Jiajie Wu 2021

Vulnerability detection has always been the most important task in the field of software security. With the development of technology, in the face of massive source code, automated analysis and detection of vulnerabilities has become a current research hotspot. For special text files such as source code, using some of the most popular NLP technologies to build models and realize the automatic analysis and detection of source code has become one of the most anticipated lines of study in the field of vulnerability detection. This article briefly surveys some recent literature and technologies, such as CodeBERT, and summarizes the previous technologies.

Cryptography and Security, Artificial Intelligence, Software Engineering

Do Explicit Alignments Robustly Improve Multilingual Encoders?

99 - Shijie Wu, Mark Dredze 2020

Multilingual BERT (mBERT), XLM-RoBERTa (XLMR) and other unsupervised multilingual encoders can effectively learn cross-lingual representations. Explicit alignment objectives based on bitexts like Europarl or MultiUN have been shown to further improve these representations. However, word-level alignments are often suboptimal and such bitexts are unavailable for many languages. In this paper, we propose a new contrastive alignment objective that can better utilize such signal, and examine whether these previous alignment methods can be adapted to noisier sources of aligned data: a randomly sampled 1 million pair subset of the OPUS collection. Additionally, rather than report results on a single dataset with a single model run, we report the mean and standard deviation of multiple runs with different seeds, on four datasets and tasks. Our more extensive analysis finds that, while our new objective outperforms previous work, overall these methods do not improve performance with a more robust evaluation framework. Furthermore, the gains from using a better underlying model eclipse any benefits from alignment training. These negative results dictate more care in evaluating these methods and suggest limitations in applying explicit alignment objectives.
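The evaluation practice described in this abstract, reporting the mean and standard deviation over several seeds instead of a single run, can be sketched in a few lines. The function names are hypothetical, and the comparison rule (a gain counts only if it exceeds the combined standard deviations) is a crude heuristic assumed here for illustration, not the criterion used in the paper.

```python
from statistics import mean, stdev


def summarize_runs(scores_by_seed):
    """Return (mean, sample standard deviation) over per-seed scores."""
    scores = list(scores_by_seed.values())
    return mean(scores), stdev(scores)


def gain_exceeds_noise(baseline, candidate):
    """Heuristic check: is the mean gain larger than the combined
    seed-to-seed spread of the two systems?"""
    base_mean, base_std = summarize_runs(baseline)
    cand_mean, cand_std = summarize_runs(candidate)
    return (cand_mean - base_mean) > (base_std + cand_std)
```

A one-point improvement looks convincing from a single run, but if scores vary by about two points from seed to seed, this check reports that the gain is not distinguishable from noise.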

Computation and Language
