Research papers, master's theses, and doctoral dissertations published by Yang Xu

Improved Latent Tree Induction with Distant Supervision via Span Constraints

97 - Zhiyang Xu , Andrew Drozdov , Jay Yoon Lee 2021

For over thirty years, researchers have developed and analyzed methods for latent tree induction as an approach for unsupervised syntactic parsing. Nonetheless, modern systems still do not perform well enough compared to their supervised counterparts to have any practical use as structural annotation of text. In this work, we present a technique that uses distant supervision in the form of span constraints (i.e., phrase bracketing) to improve performance in unsupervised constituency parsing. Using a relatively small number of span constraints, we can substantially improve the output from DIORA, an already competitive unsupervised parsing system. Compared with full parse tree annotation, span constraints can be acquired with minimal effort, for example by using a lexicon derived from Wikipedia to find exact text matches. Our experiments show that span constraints based on entities improve constituency parsing on the English WSJ Penn Treebank by more than 5 F1. Furthermore, our method extends to any domain where span constraints are easily attainable, and as a case study we demonstrate its effectiveness by parsing biomedical text from the CRAFT dataset.
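
To make the notion of a span constraint concrete, the sketch below checks whether a candidate parse respects a set of bracketed spans. It is an illustration only and not the training procedure used with DIORA: the function names, the half-open span encoding, and the example sentence are all assumptions made here.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token interval [start, end); an assumed encoding


def crosses(a: Span, b: Span) -> bool:
    """Return True if the two spans overlap without one containing the other."""
    (s1, e1), (s2, e2) = a, b
    return (s1 < s2 < e1 < e2) or (s2 < s1 < e2 < e1)


def violated_constraints(tree_spans: List[Span], constraints: List[Span]) -> List[Span]:
    """Return the constraints that are crossed by at least one span of the tree."""
    return [c for c in constraints if any(crosses(c, t) for t in tree_spans)]


# Hypothetical sentence: 0 The, 1 New, 2 York, 3 Times, 4 reported, 5 it
constraints = [(1, 4)]  # "New York Times", e.g. an exact match from an entity lexicon

# A tree that brackets "New York Times" as a unit.
good_tree = [(0, 6), (0, 4), (1, 4), (1, 3), (4, 6)]
# A tree that groups "The New" and "York Times reported it", splitting the entity.
bad_tree = [(0, 6), (0, 2), (2, 6), (2, 4), (4, 6)]

print(violated_constraints(good_tree, constraints))
print(violated_constraints(bad_tree, constraints))
```

A tree is compatible with a constraint when none of its constituents crosses the constrained span, so the constraint may or may not appear in the tree as a constituent of its own.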

Computation and Language

Predicting emergent linguistic compositions through time: Syntactic frame extension via multimodal chaining

87 - Lei Yu , Yang Xu 2021

Natural language relies on a finite lexicon to express an unbounded set of emerging ideas. One result of this tension is the formation of new compositions, such that existing linguistic units can be combined with emerging items into novel expressions. We develop a framework that exploits the cognitive mechanisms of chaining and multimodal knowledge to predict emergent compositional expressions through time. We present the syntactic frame extension model (SFEM), which draws on the theory of chaining and knowledge from percept, concept, and language to infer how verbs extend their frames to form new compositions with existing and novel nouns. We evaluate SFEM rigorously on 1) the modalities of knowledge and 2) categorization models of chaining, in a syntactically parsed English corpus covering the past 150 years. We show that multimodal SFEM predicts newly emerged verb syntax and arguments substantially better than competing models using purely linguistic or unimodal knowledge. We find support for an exemplar view of chaining as opposed to a prototype view, and reveal how the joint approach of multimodal chaining may be fundamental to the creation of literal and figurative language uses, including metaphor and metonymy.

Computation and Language

Grid-VLP: Revisiting Grid Features for Vision-Language Pre-training

159 - Ming Yan , Haiyang Xu , Chenliang Li 2021

Existing approaches to vision-language pre-training (VLP) heavily rely on an object detector based on bounding boxes (regions), where salient objects are first detected from images and then a Transformer-based model is used for cross-modal fusion. Despite their superior performance, these approaches are bounded by the capability of the object detector in terms of both effectiveness and efficiency. Besides, the presence of object detection imposes unnecessary constraints on model designs and makes it difficult to support end-to-end training. In this paper, we revisit grid-based convolutional features for vision-language pre-training, skipping the expensive region-related steps. We propose a simple yet effective grid-based VLP method that works surprisingly well with the grid features. By pre-training only with in-domain datasets, the proposed Grid-VLP method can outperform most competitive region-based VLP methods on three examined vision-language understanding tasks. We hope that our findings help to further advance the state of the art of vision-language pre-training, and provide a new direction towards effective and efficient VLP.

Multimedia; Computation and Language; Computer Vision and Pattern Recognition

Inelastic Axial and Vector Structure Functions for Lepton-Nucleon Scattering 2021 Update

126 - Arie Bodek , Un Ki Yang , Yang Xu 2021

We report on an update (2021) of a phenomenological model for inelastic neutrino- and electron-nucleon scattering cross sections using effective leading order parton distribution functions with a new scaling variable $\xi_w$. Non-perturbative effects are well described using the $\xi_w$ scaling variable in combination with multiplicative $K$ factors at low $Q^2$. The model describes all inelastic charged-lepton-nucleon scattering data (HERA/NMC/BCDMS/SLAC/JLab) ranging from very high $Q^2$ to very low $Q^2$ and down to the $Q^2=0$ photo-production region. The model has been developed to be used in the analysis of neutrino oscillation experiments in the few-GeV region. The 2021 update accounts for the difference between axial and vector structure functions, which brings it into much better agreement with neutrino-nucleon total cross section measurements. The model has been developed primarily for hadronic final state masses $W$ above 1.8 GeV. However, with additional parameters the model also describes the average neutrino cross sections in the resonance region down to $W=1.4$ GeV.

High Energy Physics - Phenomenology; High Energy Physics - Experiment

Long time asymptotics for the defocusing mKdV equation with finite density initial data in different solitonic regions

147 - Taiyang Xu , Zechuan Zhang , Engui Fan 2021

We investigate the long time asymptotics for the Cauchy problem of the defocusing modified Korteweg-de Vries (mKdV) equation with finite density initial data in different solitonic regions \begin{align*} &q_t(x,t)-6q^2(x,t)q_{x}(x,t)+q_{xxx}(x,t)=0, \quad (x,t)\in\mathbb{R}\times \mathbb{R}^{+}, \\ &q(x,0)=q_{0}(x), \quad \lim_{x\rightarrow\pm\infty}q_{0}(x)=\pm 1, \end{align*} where $q_0\mp 1\in H^{4,4}(\mathbb{R})$. Based on the spectral analysis of the Lax pair, we express the solution of the mKdV equation in terms of a Riemann-Hilbert problem. In our previous article, we obtained long time asymptotics and soliton resolutions for the mKdV equation in the solitonic region $\xi\in(-6,-2)$ with $\xi=\frac{x}{t}$. In this paper, we calculate the asymptotic expansion of the solution $q(x,t)$ for the solitonic region $\xi\in(-\varpi,-6)\cup(-2,\varpi)$ with $6 < \varpi<\infty$ being an arbitrary constant. For $-\varpi<\xi<-6$, there exist four stationary phase points on the jump contour, and the asymptotic approximations can be characterized with an $N$-soliton on the discrete spectrum and a leading order term $\mathcal{O}(t^{-1/2})$ on the continuous spectrum up to a residual error of order $\mathcal{O}(t^{-3/4})$. For $-2<\xi<\varpi$, the leading term of the asymptotic expansion is described by the soliton solution, and the error of order $\mathcal{O}(t^{-1})$ comes from a $\bar{\partial}$-problem. Additionally, asymptotic stability can be obtained.

Analysis of PDEs; Mathematical Physics

Evolution of emotion semantics

50 - Aotao Xu , Jennifer E. Stellar , Yang Xu 2021

Humans possess the unique ability to communicate emotions through language. Although concepts like anger or awe are abstract, there is a shared consensus about what these English emotion words mean. This consensus may give the impression that their meaning is static, but we propose this is not the case. We cannot travel back to earlier periods to study emotion concepts directly, but we can examine text corpora, which have partially preserved the meaning of emotion words. Using natural language processing of historical text, we found evidence for semantic change in emotion words over the past century, and found that varying rates of change were predicted in part by an emotion concept's prototypicality: how representative it is of the broader category of emotion. Prototypicality negatively correlated with historical rates of emotion semantic change obtained from text-based word embeddings, beyond more established variables including usage frequency in English and a second comparison language, French. This effect for prototypicality did not consistently extend to the semantic category of birds, suggesting its relevance for predicting semantic change may be category-dependent. Our results suggest emotion semantics are evolving over time, with prototypical emotion words remaining semantically stable, while other emotion words evolve more freely.
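
One common way to quantify a word's semantic change from text-based embeddings is the cosine distance between its vectors from two time periods. The sketch below illustrates that idea only; it is not claimed to be the measure used in the paper. The toy vectors and function names are hypothetical, and the two embedding spaces are assumed to be already aligned.

```python
import math
from typing import Dict, List


def cosine_distance(u: List[float], v: List[float]) -> float:
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def change_rates(early: Dict[str, List[float]],
                 late: Dict[str, List[float]]) -> Dict[str, float]:
    """Cosine distance between period embeddings for every word present in both."""
    return {w: cosine_distance(early[w], late[w]) for w in early if w in late}


# Toy, made-up 2-d embeddings (assumed to live in one aligned space).
early = {"anger": [1.0, 0.0], "awe": [1.0, 0.0]}
late = {"anger": [1.0, 0.0], "awe": [0.0, 1.0]}

rates = change_rates(early, late)
print(rates)
```

In this toy setup the word whose vector does not move gets a change score of zero, while the word whose vector rotates to an orthogonal direction gets a score of one.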

Computation and Language

Performance Analysis of a Two-Hop Relaying LoRa System

411 - Wenyang Xu , Guofa Cai , Yi Fang 2021

The conventional LoRa system is not able to sustain long-range communication over fading channels. To resolve this issue, this paper investigates a two-hop opportunistic amplify-and-forward relaying LoRa system. Based on the best relay-selection protocol, the analytical and asymptotic bit error rate (BER), achievable diversity order, coverage probability, and throughput of the proposed system are derived over the Nakagami-m fading channel. Simulation and numerical results show that although the proposed system reduces the throughput compared to the conventional LoRa system, it can significantly improve BER and coverage probability. Hence, the proposed system can be considered a promising platform for low-power, long-range, and highly reliable wireless-communication applications.

Information Theory; Systems and Control; Signal Processing

PPT Fusion: Pyramid Patch Transformer for a Case Study in Image Fusion

167 - Yu Fu , TianYang Xu , XiaoJun Wu 2021

The Transformer architecture has achieved rapid development in recent years, outperforming the CNN architectures in many computer vision tasks, such as the Vision Transformers (ViT) for image classification. However, existing visual transformer models aim to extract semantic information for high-level tasks such as classification and detection, distorting the spatial resolution of the input image, thus sacrificing the capacity in reconstructing the input or generating high-resolution images. In this paper, therefore, we propose a Patch Pyramid Transformer (PPT) to effectively address the above issues. Specifically, we first design a Patch Transformer to transform the image into a sequence of patches, where transformer encoding is performed for each patch to extract local representations. In addition, we construct a Pyramid Transformer to effectively extract the non-local information from the entire image. After obtaining a set of multi-scale, multi-dimensional, and multi-angle features of the original image, we design the image reconstruction network to ensure that the features can be reconstructed into the original input. To validate the effectiveness, we apply the proposed Patch Pyramid Transformer to the image fusion task. The experimental results demonstrate its superior performance against the state-of-the-art fusion approaches, achieving the best results on several evaluation indicators. The underlying capacity of the PPT network is reflected by its universal power in feature extraction and image reconstruction, which can be directly applied to different image fusion tasks without redesigning or retraining the network.
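
The step of turning an image into a sequence of patches can be illustrated with a minimal sketch. This is not the PPT implementation: the function name, the non-overlapping square patches, and the nested-list image format are assumptions chosen to keep the example self-contained.

```python
from typing import List

Image = List[List[int]]  # a single-channel image as rows of pixel values (assumed format)


def split_into_patches(image: Image, patch: int) -> List[Image]:
    """Split an image into non-overlapping patch x patch blocks in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image dimensions must be divisible by the patch size")
    patches = []
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            patches.append([row[c:c + patch] for row in image[r:r + patch]])
    return patches


# A 4x4 toy image whose pixel values are 0..15 in reading order.
toy_image = [[4 * r + c for c in range(4)] for r in range(4)]
patch_sequence = split_into_patches(toy_image, patch=2)
print(len(patch_sequence), patch_sequence[0])
```

Each element of the resulting sequence could then be encoded independently to obtain a local representation, which is the role the abstract assigns to the Patch Transformer.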

Computer Vision and Pattern Recognition

From Continuity to Editability: Inverting GANs with Consecutive Images

66 - Yangyang Xu , Yong Du , Wenpeng Xiao 2021

Existing GAN inversion methods are stuck in a paradox: the inverted codes can either achieve high-fidelity reconstruction or retain the editing capability. Having only one of them clearly cannot realize real image editing. In this paper, we resolve this paradox by introducing consecutive images (e.g., video frames or the same person with different poses) into the inversion process. The rationale behind our solution is that the continuity of consecutive images leads to inherent editable directions. This inborn property is used for two unique purposes: 1) regularizing the joint inversion process, such that each of the inverted codes is semantically accessible from one of the others and fastened in an editable domain; 2) enforcing inter-image coherence, such that the fidelity of each inverted code can be maximized with the complement of other images. Extensive experiments demonstrate that our alternative significantly outperforms state-of-the-art methods in terms of reconstruction fidelity and editability on both the real image dataset and the synthesis dataset. Furthermore, our method provides the first support of video-based GAN inversion, and an interesting application of unsupervised semantic transfer from consecutive images. Source code can be found at: https://github.com/cnnlstm/InvertingGANs_with_ConsecutiveImgs.

Computer Vision and Pattern Recognition

Distributed stochastic inertial methods with delayed derivatives

88 - Yangyang Xu , Yibo Xu , Yonggui Yan 2021

Stochastic gradient methods (SGMs) are predominant approaches for solving stochastic optimization. On smooth nonconvex problems, a few acceleration techniques have been applied to improve the convergence rate of SGMs. However, little exploration has been made on applying a certain acceleration technique to a stochastic subgradient method (SsGM) for nonsmooth nonconvex problems. In addition, few efforts have been made to analyze an (accelerated) SsGM with delayed derivatives. The information delay naturally happens in a distributed system, where computing workers do not coordinate with each other. In this paper, we propose an inertial proximal SsGM for solving nonsmooth nonconvex stochastic optimization problems. The proposed method can have guaranteed convergence even with delayed derivative information in a distributed environment. Convergence rate results are established for three classes of nonconvex problems: weakly-convex nonsmooth problems with a convex regularizer, composite nonconvex problems with a nonsmooth convex regularizer, and smooth nonconvex problems. For each problem class, the convergence rate is $O(1/K^{\frac{1}{2}})$ in the expected value of the gradient norm square, for $K$ iterations. In a distributed environment, the convergence rate of the proposed method will be slowed down by the information delay. Nevertheless, the slow-down effect will decay with the number of iterations for the latter two problem classes. We test the proposed method on three applications. The numerical results clearly demonstrate the advantages of using the inertial-based acceleration. Furthermore, we observe higher parallelization speed-up in asynchronous updates over the synchronous counterpart, though the former uses delayed derivatives.
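
The ingredients named above (an inertial term, a proximal step, stochastic and delayed derivatives) can be combined in a generic textbook-style update, sketched below on a one-dimensional toy problem. This is an assumption-laden illustration, not the authors' algorithm or step-size rule: the function names, the fixed delay, the $\ell_1$ regularizer, and all parameter values are hypothetical.

```python
import math
import random


def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |x| (assumed regularizer for this sketch)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def inertial_prox_sgm(grad, x0, lam, steps=2000, alpha0=0.1, beta=0.5,
                      delay=2, noise=0.1, seed=0):
    """Generic inertial proximal stochastic step with a stale (delayed) gradient.

    Update: x_next = prox(x - alpha_k * g + beta * (x - x_prev)), where g is a
    noisy gradient evaluated at the iterate from `delay` steps earlier.
    """
    rng = random.Random(seed)
    history = [x0]
    x_prev = x0
    for k in range(steps):
        x = history[-1]
        stale = history[max(0, len(history) - 1 - delay)]  # outdated iterate
        g = grad(stale) + rng.gauss(0.0, noise)            # noisy delayed gradient
        alpha = alpha0 / math.sqrt(k + 1)                  # diminishing step size
        v = x - alpha * g + beta * (x - x_prev)            # inertial (momentum) move
        x_prev = x
        history.append(soft_threshold(v, alpha * lam))     # proximal step
    return history[-1]


# Toy problem: minimize 0.5 * (x - 3)^2 + 0.5 * |x|, whose minimizer is x = 2.5.
x_hat = inertial_prox_sgm(lambda x: x - 3.0, x0=0.0, lam=0.5)
print(x_hat)
```

On this toy problem the iterate settles near the known minimizer despite the stale gradients, which mirrors the qualitative claim that convergence can survive delayed derivative information.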

Optimization and Control; Distributed, Parallel, and Cluster Computing; Numerical Analysis
