Research papers, master's theses, and doctoral dissertations published by Yuan Yuan

GEDIT: Geographic-Enhanced and Dependency-Guided Tagging for Joint POI and Accessibility Extraction at Baidu Maps

85 - Yibo Sun , Jizhou Huang , Chunyuan Yuan 2021

Providing timely accessibility reminders of a point-of-interest (POI) plays a vital role in improving user satisfaction of finding places and making visiting decisions. However, it is difficult to keep the POI database in sync with the real-world counterparts due to the dynamic nature of business changes. To alleviate this problem, we formulate and present a practical solution that jointly extracts POI mentions and identifies their coupled accessibility labels from unstructured text. We approach this task as a sequence tagging problem, where the goal is to produce <POI name, accessibility label> pairs from unstructured text. This task is challenging because of two main issues: (1) POI names are often newly-coined words so as to successfully register new entities or brands and (2) there may exist multiple pairs in the text, which necessitates dealing with one-to-many or many-to-one mapping to make each POI coupled with its accessibility label. To this end, we propose a Geographic-Enhanced and Dependency-guIded sequence Tagging (GEDIT) model to concurrently address the two challenges. First, to alleviate challenge #1, we develop a geographic-enhanced pre-trained model to learn the text representations. Second, to mitigate challenge #2, we apply a relational graph convolutional network to learn the tree node representations from the parsed dependency tree. Finally, we construct a neural sequence tagging model by integrating and feeding the previously pre-learned representations into a CRF layer. Extensive experiments conducted on a real-world dataset demonstrate the superiority and effectiveness of GEDIT. In addition, it has already been deployed in production at Baidu Maps. Statistics show that the proposed solution can save significant human effort and labor costs to deal with the same amount of documents, which confirms that it is a practical way for POI accessibility maintenance.
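As a rough illustration of the output side of this formulation, the sketch below turns a tagged token sequence into <POI name, accessibility label> pairs. It is not the GEDIT model: the BIO-style tag scheme, the label names `OPEN` and `CLOSED`, the example sentence, and the function `decode_pairs` are all assumptions made for illustration, and the tags are written by hand rather than predicted by a tagger.

```python
from typing import List, Optional, Tuple


def decode_pairs(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (POI name, accessibility label) pairs from BIO-style tags.

    Assumed (hypothetical) scheme: "B-<LABEL>" starts a POI mention,
    "I-<LABEL>" continues it, and "O" marks tokens outside any mention.
    """
    pairs: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Emit the mention collected so far, if there is one.
        if current_tokens and current_label is not None:
            pairs.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_label:
            current_tokens.append(token)
        else:
            # "O", or an "I-" tag that does not continue the open mention.
            flush()
            current_tokens = []
            current_label = None
    flush()
    return pairs


# Hand-written example with two pairs in one sentence (hypothetical data).
tokens = ["Blue", "Door", "Cafe", "has", "closed", ";",
          "Lakeside", "Gym", "opens", "today"]
tags = ["B-CLOSED", "I-CLOSED", "I-CLOSED", "O", "O", "O",
        "B-OPEN", "I-OPEN", "O", "O"]
pairs = decode_pairs(tokens, tags)
print(pairs)
```

Folding the accessibility label into the tag itself is one simple way to keep each POI mention coupled with its label when several pairs occur in the same text.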

Computation and Language

Simple Formulas for Output Interception Power Estimation of Uni-Traveling Carrier Photodiodes

79 - Keye Sun , Junyi Gao , Yuan Yuan 2021

Simple analytical expressions for estimating the second-order output intercept point (OIP2) and third-order output intercept point (OIP3) of surface-normal uni-traveling carrier (UTC) and modified uni-traveling carrier (MUTC) photodiodes (PDs) are derived. These equations are valuable for estimating the OIP of high-power (M)UTC-PDs during the design phase.
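For readers unfamiliar with the quantities being estimated, the sketch below computes OIP2 and OIP3 from a two-tone measurement using the standard extrapolation definitions. These are not the analytical design-phase expressions derived in the paper, which the abstract does not reproduce; the function names and the example power levels are assumptions chosen for illustration.

```python
def oip2_dbm(p_fund_dbm: float, p_imd2_dbm: float) -> float:
    """Second-order output intercept point from a two-tone measurement.

    The second-order product grows 2 dB per 1 dB of fundamental power,
    so the extrapolated lines meet at 2 * P_fund - P_IMD2 (all in dBm).
    """
    return 2.0 * p_fund_dbm - p_imd2_dbm


def oip3_dbm(p_fund_dbm: float, p_imd3_dbm: float) -> float:
    """Third-order output intercept point from a two-tone measurement.

    The third-order product grows 3 dB per 1 dB of fundamental power,
    so the intercept lies at P_fund + (P_fund - P_IMD3) / 2 (all in dBm).
    """
    return p_fund_dbm + (p_fund_dbm - p_imd3_dbm) / 2.0


# Hypothetical measured output powers per tone, in dBm.
p_fund = -10.0
p_imd2 = -70.0
p_imd3 = -90.0

print("OIP2 (dBm):", oip2_dbm(p_fund, p_imd2))
print("OIP3 (dBm):", oip3_dbm(p_fund, p_imd3))
```

A higher intercept point means the distortion products stay further below the fundamental at a given output power, which is why OIP is used as a linearity figure of merit for high-power photodiodes.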

Applied Physics

Full band Monte Carlo simulation of AlInAsSb digital alloys

91 - Jiyuan Zheng , Sheikh Z. Ahmed , Yuan Yuan 2021

Avalanche photodiodes fabricated from AlInAsSb grown as a digital alloy exhibit low excess noise. In this paper, we investigate the band structure-related mechanisms that influence impact ionization. Band-structures calculated using an empirical tight-binding method and Monte Carlo simulations reveal that the mini-gaps in the conduction band do not inhibit electron impact ionization. Good agreement between the full band Monte Carlo simulations and measured noise characteristics is demonstrated.

Materials Science

MT: Multi-Perspective Feature Learning Network for Scene Text Detection

103 - Chuang Yang , Mulin Chen , Yuan Yuan 2021

Text detection, the key technology for understanding scene text, has become an attractive research topic. For detecting various scene texts, researchers propose plenty of detectors with different advantages: detection-based models enjoy fast detection speed, and segmentation-based algorithms are not limited by text shapes. However, for most intelligent systems, the detector needs to detect arbitrary-shaped texts with high speed and accuracy simultaneously. Thus, in this study, we design an efficient pipeline named as MT, which can detect adhesive arbitrary-shaped texts with only a single binary mask in the inference stage. This paper presents the contributions on three aspects: (1) a light-weight detection framework is designed to speed up the inference process while keeping high detection accuracy; (2) a multi-perspective feature module is proposed to learn more discriminative representations to segment the mask accurately; (3) a multi-factor constraints IoU minimization loss is introduced for training the proposed model. The effectiveness of MT is evaluated on four real-world scene text datasets, and it surpasses all the state-of-the-art competitors to a large extent.

Computer Vision and Pattern Recognition

Instance-aware Remote Sensing Image Captioning with Cross-hierarchy Attention

158 - Chengze Wang , Zhiyu Jiang , Yuan Yuan 2021

Spatial attention is a straightforward approach to enhancing the performance of remote sensing image captioning. However, conventional spatial attention approaches consider only the attention distribution on one fixed coarse grid, so the semantics of tiny objects can easily be ignored or disturbed during visual feature extraction. Worse still, the fixed semantic level of conventional spatial attention limits image understanding at different levels and from different perspectives, which is critical for tackling the huge diversity in remote sensing images. To address these issues, we propose a remote sensing image caption generator with instance awareness and cross-hierarchy attention. 1) Instance awareness is achieved by introducing a multi-level feature architecture that contains the visual information of multi-level instance-possible regions and their surroundings. 2) Moreover, based on this multi-level feature extraction, a cross-hierarchy attention mechanism is proposed to prompt the decoder to dynamically focus on different semantic hierarchies and instances at each time step. The experimental results on public datasets demonstrate the superiority of the proposed approach over existing methods.

Computer Vision and Pattern Recognition

Weighted Hierarchical Sparse Representation for Hyperspectral Target Detection

245 - Chenlu Wei , Zhiyu Jiang , Yuan Yuan 2021

Hyperspectral target detection has been widely studied in the field of remote sensing. However, the issues of building the background dictionary and analyzing the correlation between the target and background dictionaries have not been well studied. To tackle these issues, a Weighted Hierarchical Sparse Representation for hyperspectral target detection is proposed. The main contributions of this work are as follows. 1) Considering the insufficient representation of the traditional background dictionary built with a dual concentric window structure, a hierarchical background dictionary is built that considers local and global spectral information simultaneously. 2) To reduce the impact of impurity in the background dictionary, target scores from the target dictionary and the background dictionary are weighted according to dictionary quality. Three hyperspectral target detection data sets are used to verify the effectiveness of the proposed method, and the experimental results show better performance than state-of-the-art methods.

Image and Video Processing, Computer Vision and Pattern Recognition

Open Set Domain Recognition via Attention-Based GCN and Semantic Matching Optimization

188 - Xinxing He , Yuan Yuan , Zhiyu Jiang 2021

Open set domain recognition has attracted attention in recent years. The task aims to specifically classify each sample in the practical unlabeled target domain, which consists of all known classes in the manually labeled source domain and target-specific unknown categories. The absence of annotated training data or auxiliary attribute information for unknown categories makes this task especially difficult. Moreover, the existing domain discrepancy in label space and data distribution further disrupts the knowledge transferred from known classes to unknown classes. To address these issues, this work presents an end-to-end model based on attention-based GCN and semantic matching optimization, which first employs the attention mechanism to enable the central node to learn more discriminative representations from its neighbors in the knowledge graph. Moreover, a coarse-to-fine semantic matching optimization approach is proposed to progressively bridge the domain gap. Experimental results validate that the proposed model is not only superior at recognizing images of known and unknown classes but can also adapt to various degrees of openness of the target domain.

Computer Vision and Pattern Recognition

200 - Zhinan Cai , Zhiyu Jiang , Yuan Yuan 2021

Change detection for remote sensing images is widely applied in urban change detection, disaster assessment, and other fields. However, most existing CNN-based change detection methods still suffer from inadequate suppression of pseudo-changes and insufficient feature representation. In this work, an unsupervised change detection method based on a Task-related Self-supervised Learning Change Detection network with a smooth mechanism (TSLCD) is proposed to address these problems. The main contributions include: (1) a task-related self-supervised learning module is introduced to extract spatial features more effectively; (2) a hard-sample-mining loss function is applied to pay more attention to hard-to-classify samples; (3) a smooth mechanism is utilized to remove some of the pseudo-changes and noise. Experiments on four remote sensing change detection datasets reveal that the proposed TSLCD method achieves state-of-the-art performance on the change detection task.

Image and Video Processing, Computer Vision and Pattern Recognition

Deep feature selection-and-fusion for RGB-D semantic segmentation

106 - Yuejiao Su , Yuan Yuan , Zhiyu Jiang 2021

Scene depth information can help visual information for more accurate semantic segmentation. However, how to effectively integrate multi-modality information into representative features is still an open problem. Most of the existing work uses DCNNs to implicitly fuse multi-modality information. But as the network deepens, some critical distinguishing features may be lost, which reduces the segmentation performance. This work proposes a unified and efficient feature selection-and-fusion network (FSFNet), which contains a symmetric cross-modality residual fusion module used for explicit fusion of multi-modality information. Besides, the network includes a detailed feature propagation module, which is used to maintain low-level detailed information during the forward process of the network. Compared with the state-of-the-art methods, experimental evaluations demonstrate that the proposed model achieves competitive performance on two public datasets.

Computer Vision and Pattern Recognition

SRLF: A Stance-aware Reinforcement Learning Framework for Content-based Rumor Detection on Social Media

396 - Chunyuan Yuan , Wanhui Qian , Qianwen Ma 2021

The rapid development of social media changes the lifestyle of people and simultaneously provides an ideal place for publishing and disseminating rumors, which severely exacerbates social panic and triggers a crisis of social trust. Early content-based methods focused on finding clues from the text and user profiles for rumor detection. Recent studies combine the stances of users' comments with news content to capture the difference between true and false rumors. Although users' stance is effective for rumor detection, the manual labeling process is time-consuming and labor-intensive, which limits its use in facilitating rumor detection. In this paper, we first fine-tune a pre-trained BERT model on a small labeled dataset and leverage this model to annotate weak stance labels for users' comment data to overcome the problem mentioned above. Then, we propose a novel Stance-aware Reinforcement Learning Framework (SRLF) to select high-quality labeled stance data for model training and rumor detection. The stance selection and rumor detection tasks are optimized simultaneously so that each promotes the other. We conduct experiments on two commonly used real-world datasets. The experimental results demonstrate that our framework outperforms the state-of-the-art models significantly, which confirms the effectiveness of the proposed framework.

Computation and Language, Artificial Intelligence
