
Lei Zhu, Zhaojing Luo, Wei Wang (2021)
Deep learning models usually require a large amount of labeled data to achieve satisfactory performance. In multimedia analysis, domain adaptation studies the problem of cross-domain knowledge transfer from a label-rich source domain to a label-scarce target domain, thus potentially alleviating the annotation requirement for deep learning models. However, we find that contemporary domain adaptation methods for cross-domain image understanding perform poorly when the source domain is noisy. Weakly Supervised Domain Adaptation (WSDA) studies the domain adaptation problem under the scenario where source data can be noisy. Prior methods on WSDA remove noisy source data and align the marginal distribution across domains without considering the fine-grained semantic structure in the embedding space, which leads to the problem of class misalignment, e.g., features of cats in the target domain might be mapped near features of dogs in the source domain. In this paper, we propose a novel method, termed Noise Tolerant Domain Adaptation, for WSDA. Specifically, we adopt the cluster assumption and learn discriminative clusters with class prototypes in the embedding space. We propose to leverage the location information of the data points in the embedding space, modeling it with a Gaussian mixture model to identify noisy source data. We then design a network which incorporates the Gaussian mixture noise model as a sub-module for unsupervised noise removal, and propose a novel cluster-level adversarial adaptation method which aligns unlabeled target data with the less noisy class prototypes, thereby mapping the semantic structure across domains. We conduct extensive experiments to evaluate the effectiveness of our method on both general images and medical images from COVID-19 and e-commerce datasets. The results show that our method significantly outperforms state-of-the-art WSDA methods.
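To make the noise-identification step above concrete, here is a minimal sketch of the general idea, not the paper's implementation: distances from source samples to the prototype of their (possibly wrong) class are modeled with a two-component Gaussian mixture over distances, and the high-distance component is treated as label noise. All names and the synthetic data below are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def flag_noisy_source(features, labels, prototypes):
    """Return a boolean mask of source samples likely to carry noisy labels.
    Assumption: clean samples lie near their labeled class prototype in the
    embedding space, noisy ones far from it (a sketch, not the paper's API)."""
    # Distance of each sample to the prototype of its assigned label
    d = np.linalg.norm(features - prototypes[labels], axis=1, keepdims=True)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(d)
    noisy_comp = np.argmax(gmm.means_.ravel())   # larger-distance component
    post = gmm.predict_proba(d)[:, noisy_comp]   # P(noisy | distance)
    return post > 0.5

# Synthetic stand-ins for learned embeddings, labels, and class prototypes
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 16))
labs = rng.integers(0, 5, size=200)
protos = rng.normal(size=(5, 16))
mask = flag_noisy_source(feats, labs, protos)
```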
Huajing Lu, Xuding Zhu (2021)
A graph $G$ is total weight $(k,k')$-choosable if for any total list assignment $L$ which assigns to each vertex $v$ a set $L(v)$ of $k$ real numbers, and to each edge $e$ a set $L(e)$ of $k'$ real numbers, there is a proper total $L$-weighting, i.e., a mapping $f: V(G) \cup E(G) \to \mathbb{R}$ such that $f(z) \in L(z)$ for each $z \in V(G) \cup E(G)$, and $\sum_{e \in E(u)}f(e)+f(u) \ne \sum_{e \in E(v)}f(e)+f(v)$ for each edge $uv$ of $G$. This paper proves that if $G$ decomposes into complete graphs of odd order, then $G$ is total weight $(1,3)$-choosable. As a consequence, every Eulerian graph $G$ of large order and with minimum degree at least $0.91|V(G)|$ is total weight $(1,3)$-choosable. We also prove that any graph $G$ with minimum degree at least $0.999|V(G)|$ is total weight $(1,4)$-choosable.
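For concreteness, the $(1,3)$ case of the definition above, in our own paraphrase, reads:

```latex
% Total weight (1,3)-choosability, specialized from the general (k,k')
% definition: each vertex list is a singleton (so vertex weights are
% forced), each edge list has three entries, and a proper weighting
% must always exist.
\[
\forall L \ \text{with}\ |L(v)|=1,\ |L(e)|=3:\quad
\exists f \ \text{with}\ f(z)\in L(z)\ \text{such that}
\]
\[
\sum_{e\in E(u)} f(e)+f(u)\ \neq\ \sum_{e\in E(v)} f(e)+f(v)
\qquad \text{for every edge } uv\in E(G).
\]
```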
Scene video text spotting (SVTS) is a very important research topic because of its many real-life applications. However, only a little effort has been put into spotting scene video text, in contrast to the massive studies of scene text spotting in static images. Due to various environmental interferences like motion blur, spotting scene video text becomes very challenging. To promote this research area, this competition introduces a new challenge dataset containing 129 video clips from 21 natural scenarios with full annotations. The competition contains three tasks: video text detection (Task 1), video text tracking (Task 2) and end-to-end video text spotting (Task 3). During the competition period (opened on 1st March, 2021 and closed on 11th April, 2021), a total of 24 teams participated in the three proposed tasks with 46 valid submissions. This paper includes dataset descriptions, task definitions, evaluation protocols and result summaries of the ICDAR 2021 SVTS competition. Thanks to the healthy number of teams as well as submissions, we consider that the SVTS competition has been successfully held, drawing much attention from the community and promoting research in the field and its development.
Pretrained Language Models (PLMs) have achieved tremendous success in natural language understanding tasks. While different learning schemes -- fine-tuning, zero-shot and few-shot learning -- have been widely explored and compared for languages such as English, there is comparatively little work in Chinese to fairly and comprehensively evaluate and compare these methods. This work first introduces the Chinese Few-shot Learning Evaluation Benchmark (FewCLUE), the first comprehensive small-sample evaluation benchmark in Chinese. It includes nine tasks, ranging from single-sentence and sentence-pair classification tasks to machine reading comprehension tasks. Given the high variance of few-shot learning performance, we provide multiple training/validation sets to facilitate a more accurate and stable evaluation of few-shot modeling. An unlabeled training set with up to 20,000 additional samples per task is provided, allowing researchers to explore better ways of using unlabeled samples. Next, we implement a set of state-of-the-art (SOTA) few-shot learning methods (including PET, ADAPET, LM-BFF, P-tuning and EFL), and compare their performance with fine-tuning and zero-shot learning schemes on the newly constructed FewCLUE benchmark. Our results show that: 1) all five few-shot learning methods exhibit better performance than fine-tuning or zero-shot learning; 2) among the five methods, PET is the best performing few-shot method; 3) few-shot learning performance is highly dependent on the specific task. Our benchmark and code are available at https://github.com/CLUEbenchmark/FewCLUE
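As a rough illustration of the cloze-style idea behind PET-like methods (our own toy example, not the FewCLUE reference code), a classification input can be recast as a fill-in-the-blank query to a Chinese masked language model, with hand-picked label words acting as the verbalizer:

```python
from transformers import pipeline

# Pattern-verbalizer sketch: wrap the input in a template with a [MASK]
# slot, then compare the masked LM's scores for the label words.
fill = pipeline("fill-mask", model="bert-base-chinese")

# Template and label words are our own choices for a sentiment toy task.
pattern = "画面精彩,剧情紧凑。总之这部电影很[MASK]。"
verbalizers = {"好": "positive", "差": "negative"}

scores = {}
for cand in fill(pattern, targets=list(verbalizers)):
    scores[cand["token_str"]] = cand["score"]

label = verbalizers[max(scores, key=scores.get)]
print(scores, "->", label)
```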
Background: In recent years, low-code development (LCD) has been growing rapidly, and Gartner and Forrester have predicted that the use of LCD is very promising. Giant companies, such as Microsoft, Mendix, and OutSystems, have also launched their LCD platforms. Aim: In this work, we explored two popular online developer communities, Stack Overflow (SO) and Reddit, to provide insights on the characteristics and challenges of LCD from a practitioners' perspective. Method: We used two LCD-related terms to search the relevant posts in SO and extracted 73 posts. Meanwhile, we explored three LCD-related subreddits from Reddit and collected 228 posts. We extracted data from these posts and applied the Constant Comparison method to analyze the descriptions, benefits, and limitations and challenges of LCD. For platforms and programming languages used in LCD, implementation units in LCD, supporting technologies of LCD, types of applications developed by LCD, and domains that use LCD, we used descriptive statistics to analyze and present the results. Results: Our findings show that: (1) LCD may provide a graphical user interface for users to drag and drop with little or even no code; (2) the provision of out-of-the-box units (e.g., APIs and components) in LCD platforms makes them easy to learn and use as well as speeds up development; (3) LCD is particularly favored in domains that have a need for automated processes and workflows; and (4) practitioners have conflicting views on the advantages and disadvantages of LCD. Conclusions: Our findings suggest that researchers should clearly define the terms when they refer to LCD, and developers should consider whether the characteristics of LCD are appropriate for their projects.
Propagation effects in the interstellar medium and intrinsic profile changes can cause variability in the timing of pulsars, which limits the accuracy of fundamental science done via pulsar timing. One of the best timing pulsars, PSR J1713+0747, has gone through two "dip" events in its dispersion measure time series. If these events reflect real changes in electron column density, they should lead to multiple imaging. We show that the events are well-fit by an underdense corrugated sheet model, and look for associated variability in the pulse profile using principal component analysis. We find that there are transient pulse profile variations, but they vary in concert with the dispersion measure, unlike what is expected from lensing due to a corrugated sheet. The change is consistent in shape across profiles from both the Green Bank and Arecibo radio observatories, and its amplitude appears to be achromatic across the 820 MHz, 1.4 GHz, and 2.3 GHz bands, again unlike what is expected from interference between lensed images. This result is puzzling. We note that some of the predicted lensing effects would need higher time and frequency resolution data than used in this analysis. Future events appear likely, and storing baseband data or keeping multiple time-frequency resolutions will allow more in-depth study of propagation effects and hence improvements to pulsar timing accuracy.
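A minimal sketch of the PCA step described above, run on synthetic profiles rather than the PSR J1713+0747 data; the per-epoch coefficients of the leading eigenprofiles are what one would compare against the dispersion measure time series:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data: each row is one epoch's pulse profile,
# built from a Gaussian template plus a transient shape change and noise.
rng = np.random.default_rng(1)
n_epochs, n_bins = 300, 512
template = np.exp(-0.5 * ((np.arange(n_bins) - 256) / 8.0) ** 2)
drift = rng.normal(size=(n_epochs, 1)) * np.roll(template, 3)
profiles = template + drift + 0.01 * rng.normal(size=(n_epochs, n_bins))

# PCA on mean-subtracted profiles: components_ are "eigenprofiles",
# coeffs are their per-epoch amplitudes.
centered = profiles - profiles.mean(axis=0)
pca = PCA(n_components=3).fit(centered)
coeffs = pca.transform(centered)
# Correlating coeffs[:, 0] with the DM time series would test whether
# profile variations track the dispersion measure, as the study found.
```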
Jin-Lei Tan, Xun-Wei Xu, Jing Lu (2021)
We study the coherent transport of one or two photons in a 1D waveguide chirally coupled to a nonlinear resonator. Analytic solutions of the one-photon and two-photon scattering are derived. Although the resonator acts as a non-reciprocal phase shifter, light transmission is reciprocal at the one-photon level. However, the forward and reverse transmission probabilities for two photons incident from either the left side or the right side of the nonlinear resonator are nonreciprocal due to the energy redistribution of the two-photon bound state. Hence, the nonlinear resonator acts as an optical diode at the two-photon level.
Jing Lu (2021)
This paper focuses on the analytical probabilistic modeling of vehicular traffic. It formulates a stochastic node model. It then formulates a network model by coupling the node model with the link model of Lu and Osorio (2018), which is a stochastic formulation of the traffic-theoretic link transmission model. The proposed network model is scalable and computationally efficient, making it suitable for urban network optimization. For a network with $r$ links, each of space capacity $\ell$, the model has a complexity of $\mathcal{O}(r\ell)$. The network model yields the marginal distribution of link states. The model is validated against a simulation-based network implementation of the stochastic link transmission model. The validation experiments consider a set of small networks with intricate traffic dynamics. For all scenarios, the proposed model accurately captures the traffic dynamics. The network model is used to address a signal control problem. Compared to the probabilistic link model of Lu and Osorio (2018) with an exogenous node model and a benchmark deterministic network loading model, the proposed network model derives signal plans with better performance. The case study highlights the added value of using between-link (i.e., across-node) interaction information for traffic management and of accounting for stochasticity in the network.
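To illustrate where an $\mathcal{O}(r\ell)$ cost can come from, here is a deliberately simplified sketch (not the Lu and Osorio formulation): if each link's state marginal is a distribution over occupancies $0,\dots,\ell$ updated by a birth-death-style transition, one update costs $O(\ell)$ per link and $O(r\ell)$ for the network.

```python
import numpy as np

def step_link_marginal(p, lam, mu):
    """One update of a link's occupancy distribution p (length ell+1).
    lam: prob. a vehicle enters this step; mu: prob. one leaves.
    Assumes lam + mu <= 1 so the stay probability is nonnegative.
    A toy birth-death chain, not the stochastic link transmission model."""
    ell = len(p) - 1
    q = np.zeros_like(p)
    for n in range(ell + 1):
        stay = 1.0 - lam * (n < ell) - mu * (n > 0)
        q[n] += p[n] * stay
        if n < ell:
            q[n + 1] += p[n] * lam   # arrival
        if n > 0:
            q[n - 1] += p[n] * mu    # departure
    return q

# r links, each with space capacity ell: one step costs O(r*ell).
r, ell = 4, 10
links = [np.eye(ell + 1)[0] for _ in range(r)]   # all links start empty
links = [step_link_marginal(p, lam=0.3, mu=0.5) for p in links]
```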
Poetry generation has long been a challenge for artificial intelligence. Within the scope of Japanese poetry generation, many researchers have paid attention to Haiku generation, but few have focused on Waka generation. To further explore the creative potential of natural language generation systems in Japanese poetry creation, we propose a novel Waka generation model, WakaVT, which automatically produces Waka poems given user-specified keywords. Firstly, an additive mask-based approach is presented to satisfy the form constraint. Secondly, the structures of the Transformer and the variational autoencoder are integrated to enhance the quality of the generated content. Specifically, to obtain novelty and diversity, WakaVT employs a sequence of latent variables, which effectively captures word-level variability in Waka data. To improve linguistic quality in terms of fluency, coherence, and meaningfulness, we further propose a fused multilevel self-attention mechanism, which properly models the hierarchical linguistic structure of Waka. To the best of our knowledge, we are the first to investigate Waka generation with models based on the Transformer and/or variational autoencoder. Both objective and subjective evaluation results demonstrate that our model significantly outperforms the baselines.
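As a hedged guess at what an additive mask for the form constraint could look like (WakaVT's actual mechanism may differ), one can add large negative values to the logits so that a segment separator is forced exactly at the 5-7-5-7-7 mora boundaries and forbidden elsewhere:

```python
import numpy as np

# Toy vocabulary: token 0 is a hypothetical segment separator.
SEP, VOCAB = 0, 1000
# Separator positions, accounting for earlier separators in the sequence:
# after 5, then 7, 5, 7, 7 content tokens.
boundaries = set(int(b) for b in np.cumsum([5, 7, 5, 7, 7]) + np.arange(5))

def form_mask(position):
    """Additive mask: -inf entries remove tokens from the distribution."""
    mask = np.zeros(VOCAB)
    if position in boundaries:
        mask[1:] = -np.inf        # only the separator is allowed here
    else:
        mask[SEP] = -np.inf       # the separator is forbidden elsewhere
    return mask

logits = np.random.default_rng(2).normal(size=VOCAB)
step = 5                          # decoding position 5 = first boundary
token = int(np.argmax(logits + form_mask(step)))   # == SEP by construction
```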
The human ability to flexibly reason using analogies with domain-general content depends on mechanisms for identifying relations between concepts, and for mapping concepts and their relations across analogs. Building on a recent model of how semantic relations can be learned from non-relational word embeddings, we present a new computational model of mapping between two analogs. The model adopts a Bayesian framework for probabilistic graph matching, operating on semantic relation networks constructed from distributed representations of individual concepts and of relations between concepts. Through comparisons of model predictions with human performance in a novel mapping task requiring integration of multiple relations, as well as in several classic studies, we demonstrate that the model accounts for a broad range of phenomena involving analogical mapping by both adults and children. We also show the potential for extending the model to deal with analog retrieval. Our approach demonstrates that human-like analogical mapping can emerge from comparison mechanisms applied to rich semantic representations of individual concepts and relations.
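One ingredient of the approach above can be sketched as follows, with random stand-ins for learned relation embeddings; the paper's model performs Bayesian probabilistic graph matching, of which this maximum-similarity one-to-one assignment is only a crude point estimate:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(3)
n = 4                                   # concepts per analog
# Relation embedding for each ordered pair of concepts in each analog;
# analog B is a noisy copy of A, so the true mapping is the identity.
rel_a = rng.normal(size=(n, n, 32))
rel_b = rel_a + 0.1 * rng.normal(size=(n, n, 32))

# Score concept i in A against concept j in B by the agreement of their
# outgoing relation vectors (simplification: compared slot-by-slot,
# rather than jointly over all candidate correspondences).
sim = np.einsum('ikd,jkd->ij', rel_a, rel_b)
row, col = linear_sum_assignment(-sim)  # maximize total similarity
print(dict(zip(row.tolist(), col.tolist())))   # recovers i -> i here
```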