102 - Jialei Li, Xiaodong Liu, 2021
The inverse electromagnetic source scattering problem from multi-frequency sparse electric far field patterns is considered. The underlying source is a combination of electric dipoles and magnetic dipoles. We show that the locations and the polarization strengths of the dipoles can be uniquely determined by the multi-frequency electric far field patterns at sparse observation directions. The uniqueness arguments rely on some geometrical discussions and ingenious integrals of the electric far field patterns with properly chosen functions. Motivated by the uniqueness proof, we introduce two indicator functions for locating the magnetic dipoles and the electric dipoles, respectively. Having located all the dipoles, we propose formulas for computing the corresponding polarization strengths. Finally, some numerical examples are presented to show the validity and robustness of the proposed algorithm.
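To make the indicator idea concrete, here is a schematic scalar version of a multi-frequency sampling indicator of this general kind; the paper's actual indicators act on the vector electric far field patterns and distinguish electric from magnetic dipoles, which this sketch does not:

I_{\hat{x}}(z) := \left| \frac{1}{k_{\max}-k_{\min}} \int_{k_{\min}}^{k_{\max}} E^{\infty}(\hat{x};k)\, e^{\,\mathrm{i} k\, \hat{x}\cdot z}\, \mathrm{d}k \right|.

Since the far field pattern of a dipole located at y_j carries the phase factor e^{-\mathrm{i} k \hat{x}\cdot y_j}, an indicator of this form is large on the hyperplanes \hat{x}\cdot(z-y_j)=0, and intersecting these strips over several sparse observation directions \hat{x} localizes the dipoles.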
150 - Xiaodong Liu, Shixu Meng, 2021
We consider the inverse source problems with multi-frequency sparse near field measurements. In contrast to the existing near field operator based on the integral over the space variable, a multi-frequency near field operator is introduced based on the integral over the frequency variable. A factorization of this multi-frequency near field operator is further given and analysed. Motivated by such a factorization, we introduce a multi-frequency sampling method to reconstruct the source support. Its theoretical foundation is then derived from the properties of the factorized operators and a properly chosen point spread function. Numerical examples are provided to illustrate the multi-frequency sampling method with sparse near field measurements. Finally, we briefly discuss how to extend the near field case to the far field case.
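The Python sketch below illustrates only the general flavour of a multi-frequency sampling indicator computed from sparse near field data: correlate the measured data with the Helmholtz fundamental solution over frequency and plot the result on a sampling grid. The operator, factorization, and point spread function used in the paper are not reproduced here, and the receiver positions, wavenumbers, and synthetic data are illustrative choices.

```python
import numpy as np

def fundamental_solution(x, z, k):
    """3D Helmholtz fundamental solution Phi_k(x, z)."""
    r = np.linalg.norm(x - z)
    return np.exp(1j * k * r) / (4.0 * np.pi * r)

def indicator(z, receivers, wavenumbers, data):
    """Correlate data u[m, n] = u(x_m; k_n) with the conjugated
    fundamental solution over receivers and frequencies."""
    val = 0.0 + 0.0j
    for m, x in enumerate(receivers):
        for n, k in enumerate(wavenumbers):
            val += data[m, n] * np.conj(fundamental_solution(x, z, k))
    return abs(val)

# Synthetic test: a single point source at y0 observed at 4 sparse receivers.
y0 = np.array([0.3, -0.2, 0.1])
receivers = [np.array(p) for p in [(3, 0, 0), (0, 3, 0), (-3, 0, 0), (0, -3, 0)]]
wavenumbers = np.linspace(1.0, 20.0, 40)
data = np.array([[fundamental_solution(x, y0, k) for k in wavenumbers]
                 for x in receivers])

print(indicator(y0, receivers, wavenumbers, data))                        # large near the source
print(indicator(np.array([1.5, 1.5, 0.0]), receivers, wavenumbers, data)) # small away from it
```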
This paper investigates the inverse scattering problems using sampling methods with near field measurements. The near field measurements appear in two classical inverse scattering problems: the inverse scattering for obstacles and the interior inverse scattering for cavities. We propose modified sampling methods to treat these two classical problems using near field measurements without making any asymptotic assumptions on the distance between the measurement surface and the scatterers. We provide theoretical justifications based on the factorization of the near field operator in both the symmetric and the non-symmetric factorization cases. Furthermore, we introduce a data completion algorithm which allows us to apply the modified sampling methods to treat the limited-aperture inverse scattering problems. Finally, numerical examples are provided to illustrate the modified sampling methods with both full- and limited-aperture near field measurements.
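For orientation only, the standard textbook range criterion behind factorization-type justifications in the symmetric case reads, schematically (this is the classical criterion, not the paper's modified near field version): if the data operator N admits a factorization N = H^{*} T H with a middle operator T satisfying the usual assumptions, then

\mathrm{Range}\big(H^{*}\big) = \mathrm{Range}\big(N_{\#}^{1/2}\big), \qquad N_{\#} := |\mathrm{Re}\, N| + |\mathrm{Im}\, N|,

and the scatterer D is characterized by testing whether \Phi(\cdot, z)\big|_{\Lambda}, the fundamental solution with source point z restricted to the measurement surface \Lambda, belongs to \mathrm{Range}\big(N_{\#}^{1/2}\big); numerically this test is evaluated through a Picard series over the eigensystem of N_{\#}.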
We introduce two data completion algorithms for the limited-aperture problems in inverse acoustic scattering. Both completion algorithms are independent of the topological and physical properties of the unknown scatterers. The main idea is to relate the limited-aperture data to the full-aperture data via the prolate matrix. The data completion algorithms are simple and fast since only the approximate inversion of the prolate matrix is involved. We then combine the data completion algorithms with imaging methods such as the factorization method and the direct sampling method for the object reconstructions. A variety of numerical examples are presented to illustrate the effectiveness and robustness of the proposed algorithms.
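The following is a minimal sketch, not the paper's algorithm, of the ingredient it names: the prolate matrix and an approximate (here Tikhonov-regularized) inversion of it. The size N, bandwidth parameter W, and regularization parameter alpha are illustrative choices.

```python
import numpy as np

def prolate_matrix(N, W):
    """Prolate matrix: P[m, n] = sin(2*pi*W*(m-n)) / (pi*(m-n)) off the
    diagonal and 2*W on the diagonal, with 0 < W < 1/2."""
    idx = np.arange(N)
    diff = idx[:, None] - idx[None, :]
    P = np.empty((N, N))
    nz = diff != 0
    P[nz] = np.sin(2 * np.pi * W * diff[nz]) / (np.pi * diff[nz])
    P[~nz] = 2 * W
    return P

def approx_inverse(P, alpha=1e-3):
    """Tikhonov-regularized approximate inverse of the ill-conditioned
    prolate matrix: (P^T P + alpha I)^{-1} P^T."""
    N = P.shape[0]
    return np.linalg.solve(P.T @ P + alpha * np.eye(N), P.T)

P = prolate_matrix(64, W=0.25)   # aperture covering a quarter of the band (illustrative)
P_inv = approx_inverse(P)
print(np.linalg.cond(P))         # shows why only an *approximate* inversion is practical
```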
We present a simple yet effective Targeted Adversarial Training (TAT) algorithm to improve adversarial training for natural language understanding. The key idea is to introspect current mistakes and prioritize adversarial training steps to where the model errs the most. Experiments show that TAT can significantly improve accuracy over standard adversarial training on GLUE and attain new state-of-the-art zero-shot results on XNLI. Our code will be released at: https://github.com/namisan/mt-dnn.
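A hedged sketch of the core idea only, in PyTorch-style pseudo-training code: give more adversarial-training weight to the examples the model currently gets wrong. The real TAT algorithm, its perturbation construction, and its schedule are in the paper and the linked repository; the model, embedding shapes, and epsilon below are placeholders.

```python
import torch
import torch.nn.functional as F

def targeted_adversarial_step(model, embeddings, labels, epsilon=1e-2):
    embeddings = embeddings.detach().requires_grad_(True)
    logits = model(embeddings)
    per_example_loss = F.cross_entropy(logits, labels, reduction="none")

    # "Introspect current mistakes": weight examples by how wrong they are.
    with torch.no_grad():
        weights = torch.softmax(per_example_loss, dim=0)

    loss = (weights * per_example_loss).sum()
    grad, = torch.autograd.grad(loss, embeddings)

    # Simple gradient-based perturbation in embedding space (illustrative).
    perturbed = embeddings + epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)
    adv_logits = model(perturbed.detach())
    adv_loss = (weights * F.cross_entropy(adv_logits, labels, reduction="none")).sum()
    return adv_loss

# Toy usage with a linear "model" over flattened embeddings (illustrative).
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(8 * 4, 3))
emb = torch.randn(5, 8, 4)             # (batch, seq_len, hidden)
labels = torch.randint(0, 3, (5,))
print(targeted_adversarial_step(model, emb, labels))
```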
Misconfigurations have become the dominant causes of software failures in recent years, drawing tremendous attention for their increasing prevalence and severity. Configuration constraints can preemptively avoid misconfiguration by defining the condi tions that configuration options should satisfy. Documentation is the main source of configuration constraints, but it might be incomplete or inconsistent with the source code. In this regard, prior researches have focused on obtaining configuration constraints from software source code through static analysis. However, the difficulty in pointer analysis and context comprehension prevents them from collecting accurate and comprehensive constraints. In this paper, we observed that software logs often contain configuration constraints. We conducted an empirical study and summarized patterns of configuration-related log messages. Guided by the study, we designed and implemented ConfInLog, a static tool to infer configuration constraints from log messages. ConfInLog first selects configuration-related log messages from source code by using the summarized patterns, then infers constraints from log messages based on the summarized natural language patterns. To evaluate the effectiveness of ConfInLog, we applied our tool on seven popular open-source software systems. ConfInLog successfully inferred 22 to 163 constraints, in which 59.5% to 61.6% could not be inferred by the state-of-the-art work. Finally, we submitted 67 documentation patches regarding the constraints inferred by ConfInLog. The constraints in 29 patches have been confirmed by the developers, among which 10 patches have been accepted.
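The sketch below is illustrative only: ConfInLog's message and constraint patterns come from the paper's empirical study, whereas the regexes here are made-up stand-ins that merely show the two-step shape of the approach (select configuration-related log messages, then map natural-language patterns in them to constraints).

```python
import re

CONFIG_MSG = re.compile(r"\b(option|config|configuration|parameter|setting)\b", re.I)

CONSTRAINT_PATTERNS = [
    (re.compile(r"(\w+) must be between (\d+) and (\d+)", re.I),
     lambda m: f"{m.group(1)} in [{m.group(2)}, {m.group(3)}]"),
    (re.compile(r"(\w+) (?:must|should) not be empty", re.I),
     lambda m: f"{m.group(1)} != ''"),
]

def infer_constraints(log_messages):
    constraints = []
    for msg in log_messages:
        if not CONFIG_MSG.search(msg):
            continue                          # step 1: keep config-related messages
        for pattern, build in CONSTRAINT_PATTERNS:
            m = pattern.search(msg)
            if m:
                constraints.append(build(m))  # step 2: NL pattern -> constraint
    return constraints

print(infer_constraints([
    "Invalid configuration: cache_size must be between 16 and 4096",
    "config option log_dir should not be empty",
]))
```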
Existing curriculum learning approaches to Neural Machine Translation (NMT) require sampling sufficient amounts of easy samples from training data at the early training stage. This is not always achievable for low-resource languages, where the amount of training data is limited. To address this limitation, we propose a novel token-wise curriculum learning approach that creates sufficient amounts of easy samples. Specifically, the model learns to predict a short sub-sequence from the beginning part of each target sentence at the early stage of training, and the sub-sequence is then gradually expanded as training progresses. Such a curriculum design is inspired by the cumulative effect of translation errors, which makes the later tokens more difficult to predict than the beginning ones. Extensive experiments show that our approach can consistently outperform baselines on 5 language pairs, especially for low-resource languages. Combining our approach with sentence-level methods further improves the performance on high-resource languages.
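A minimal sketch of the schedule described above, assuming the visible target prefix grows linearly with training progress; the paper's actual schedule and its integration with the NMT training loop may differ.

```python
def curriculum_target(target_tokens, step, total_steps, min_len=2):
    """Return the prefix of the target sentence the model is trained to
    predict at this step; the prefix expands as training progresses."""
    progress = min(step / max(total_steps, 1), 1.0)
    visible = max(min_len, int(round(progress * len(target_tokens))))
    return target_tokens[:visible]

sentence = ["Das", "ist", "ein", "kurzer", "Beispielsatz", "."]
for step in (0, 2500, 5000, 10000):
    print(step, curriculum_target(sentence, step, total_steps=10000))
```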
Current open-domain question answering systems often follow a Retriever-Reader architecture, where the retriever first retrieves relevant passages and the reader then reads the retrieved passages to form an answer. In this paper, we propose a simple and effective passage reranking method, named Reader-guIDEd Reranker (RIDER), which does not involve training and reranks the retrieved passages solely based on the top predictions of the reader before reranking. We show that RIDER, despite its simplicity, achieves 10 to 20 absolute gains in top-1 retrieval accuracy and 1 to 4 Exact Match (EM) gains without refining the retriever or reader. In addition, RIDER, without any training, outperforms state-of-the-art transformer-based supervised rerankers. Remarkably, RIDER achieves 48.3 EM on the Natural Questions dataset and 66.4 EM on the TriviaQA dataset when only 1,024 tokens (7.8 passages on average) are used as the reader input after passage reranking.
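A hedged sketch of the reranking idea as the abstract describes it: promote retrieved passages that contain any of the reader's current top answer predictions, keeping the original order otherwise. RIDER's exact scoring and matching details may differ.

```python
def rider_rerank(passages, top_predictions):
    """Move passages containing a predicted answer string to the front."""
    preds = [a.lower() for a in top_predictions]
    hits = [p for p in passages if any(a in p.lower() for a in preds)]
    misses = [p for p in passages if p not in hits]
    return hits + misses

passages = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Gustave Eiffel's company designed the tower for the 1889 World's Fair.",
    "The Statue of Liberty was dedicated in 1886 in New York Harbor.",
]
print(rider_rerank(passages, top_predictions=["1886", "New York"]))
```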
To date, most recent work under the retrieval-reader framework for open-domain QA focuses exclusively on either extractive or generative readers. In this paper, we study a hybrid approach for leveraging the strengths of both models. We apply novel techniques to enhance both extractive and generative readers built upon recent pretrained neural language models, and find that proper training methods can provide large improvements over previous state-of-the-art models. We demonstrate that a simple hybrid approach that combines answers from both readers can efficiently take advantage of extractive and generative answer inference strategies and outperforms single models as well as homogeneous ensembles. Our approach outperforms previous state-of-the-art models by 3.3 and 2.7 points in exact match on NaturalQuestions and TriviaQA, respectively.
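An illustrative answer-combination sketch only; the paper's hybrid method and its calibration of extractive versus generative scores are not reproduced here, and the weight below is a hypothetical parameter.

```python
from collections import defaultdict

def combine_answers(extractive, generative, weight_ext=0.5):
    """extractive/generative: lists of (answer_string, score) pairs with
    scores already normalized to [0, 1]. Returns the fused best answer
    (normalized to lower case) and its fused score."""
    scores = defaultdict(float)
    for ans, s in extractive:
        scores[ans.strip().lower()] += weight_ext * s
    for ans, s in generative:
        scores[ans.strip().lower()] += (1.0 - weight_ext) * s
    return max(scores.items(), key=lambda kv: kv[1])

print(combine_answers(
    extractive=[("Marie Curie", 0.9), ("Pierre Curie", 0.4)],
    generative=[("Marie Curie", 0.8), ("Irene Joliot-Curie", 0.3)],
))
```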
We propose Generation-Augmented Retrieval (GAR) for answering open-domain questions, which augments a query through text generation of heuristically discovered relevant contexts without external resources as supervision. We demonstrate that the generated contexts substantially enrich the semantics of the queries and GAR with sparse representations (BM25) achieves comparable or better performance than state-of-the-art dense retrieval methods such as DPR. We show that generating diverse contexts for a query is beneficial as fusing their results consistently yields better retrieval accuracy. Moreover, as sparse and dense representations are often complementary, GAR can be easily combined with DPR to achieve even better performance. GAR achieves state-of-the-art performance on Natural Questions and TriviaQA datasets under the extractive QA setup when equipped with an extractive reader, and consistently outperforms other retrieval methods when the same generative reader is used.
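A sketch of the query-augmentation-plus-fusion idea: expand the question with generated contexts (hard-coded stand-ins below for a generator's output), retrieve once per expanded query, and fuse the ranked lists. The retriever here is a toy term-overlap scorer rather than BM25, and the fusion is plain reciprocal-rank fusion; GAR's actual generation and retrieval pipeline is not reproduced.

```python
import re

def tokens(text):
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, corpus, k=3):
    """Toy sparse retriever: rank documents by term overlap with the query."""
    q = tokens(query)
    return sorted(corpus, key=lambda d: -len(q & tokens(d)))[:k]

def fuse(ranked_lists, k=60):
    """Reciprocal-rank fusion of several rankings."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

corpus = [
    "The Nile flows through eleven countries in northeastern Africa.",
    "The Amazon river carries more water than any other river.",
    "Cairo lies on the banks of the Nile in Egypt.",
]
question = "Which river flows through Cairo?"
generated_contexts = ["Cairo is the capital of Egypt on the Nile",
                      "The Nile passes through Cairo"]

rankings = [retrieve(question, corpus)] + \
           [retrieve(question + " " + c, corpus) for c in generated_contexts]
print(fuse(rankings)[0])
```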