Local metal-poor galaxies are ideal analogues of primordial galaxies, with an interstellar medium (ISM) that is barely enriched with metals. However, it is unclear whether carbon monoxide remains a good tracer and coolant of molecular gas at low metallicity. Based on observations with the upgraded Northern Extended Millimeter Array (NOEMA), we report a marginal detection of CO $J$=2-1 emission in IZw18, pushing the detection limit down to $L'_{\rm CO(2-1)}$ = 3.99$\times$10$^3$ K km s$^{-1}$ pc$^2$, which is at least 40 times lower than in previous studies. As one of the most metal-poor galaxies, IZw18 shows extremely low CO content despite its vigorous star formation activity. Compared with other galaxies, such a low CO content relative to its infrared luminosity, star formation rate, and [CII] luminosity indicates a significant change in ISM properties at a few percent of the Solar metallicity. In particular, the high [CII] luminosity relative to CO implies a molecular reservoir larger than that traced by CO in IZw18. We also obtain an upper limit on the 1.3 mm continuum, which excludes a sub-millimetre excess in IZw18.
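For context (this relation is not stated in the abstract itself), CO line luminosities of the kind quoted above are conventionally derived from the velocity-integrated line flux via the standard conversion of Solomon & Vanden Bout (2005): $L'_{\rm CO} = 3.25\times10^{7}\, S_{\rm CO}\Delta v\, \nu_{\rm obs}^{-2}\, D_L^{2}\,(1+z)^{-3}$ K km s$^{-1}$ pc$^2$, where $S_{\rm CO}\Delta v$ is the integrated line flux in Jy km s$^{-1}$, $\nu_{\rm obs}$ the observed line frequency in GHz, and $D_L$ the luminosity distance in Mpc.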
In this paper, we propose a novel query design for transformer-based detectors. In previous transformer-based detectors, the object queries are a set of learned embeddings. However, each learned embedding has no explicit physical meaning, and we cannot explain where it will focus. It is also difficult to optimize, because the prediction slot of each object query has no specific mode; in other words, each object query does not focus on a specific region. To solve these problems, in our query design the object queries are based on anchor points, which are widely used in CNN-based detectors, so each object query focuses on the objects near its anchor point. Moreover, our query design can predict multiple objects at one position, addressing the difficulty of "one region, multiple objects". In addition, we design an attention variant that reduces the memory cost while achieving similar or better performance than the standard attention in DETR. Thanks to the query design and the attention variant, the proposed detector, which we call Anchor DETR, achieves better performance and runs faster than DETR with 10$\times$ fewer training epochs. For example, it achieves 44.2 AP at 16 FPS on the MSCOCO dataset when using the ResNet50-DC5 feature and training for 50 epochs. Extensive experiments on the MSCOCO benchmark demonstrate the effectiveness of the proposed methods. Code is available at https://github.com/megvii-model/AnchorDETR.
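To make the anchor-based query idea concrete, here is a minimal, hypothetical PyTorch sketch (not the authors' implementation; the class AnchorQueryGenerator and the parameter num_patterns are invented for illustration): learnable 2-D anchor points are mapped to query embeddings, and a few pattern embeddings per anchor let one position emit several object predictions.

# Minimal sketch (not the authors' code): object queries derived from 2-D anchor
# points instead of free-form learned embeddings, with several "pattern"
# embeddings per anchor so one position can host multiple predictions.
import torch
import torch.nn as nn

class AnchorQueryGenerator(nn.Module):
    def __init__(self, num_anchors=300, num_patterns=3, dim=256):
        super().__init__()
        # Anchors are learnable (x, y) points in [0, 1]; a fixed grid would also work.
        self.anchors = nn.Parameter(torch.rand(num_anchors, 2))
        # Small MLP that turns an anchor coordinate into a query embedding.
        self.embed = nn.Sequential(nn.Linear(2, dim), nn.ReLU(), nn.Linear(dim, dim))
        # One extra embedding per pattern lets each anchor emit several object slots.
        self.patterns = nn.Embedding(num_patterns, dim)

    def forward(self):
        q = self.embed(self.anchors)                       # (A, dim)
        q = q[:, None, :] + self.patterns.weight[None]     # (A, P, dim)
        return q.flatten(0, 1), self.anchors               # (A*P, dim) queries + anchor coords

queries, anchors = AnchorQueryGenerator()()
print(queries.shape)   # torch.Size([900, 256])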
Qi Shi, Yu Zhang, Qingyu Yin (2021)
Table-based fact verification aims to verify whether a given statement is supported by a given semi-structured table. Symbolic reasoning with logical operations plays a crucial role in this task. Existing methods leverage programs that contain rich logical information to enhance the verification process. However, due to the lack of fully supervised signals in the program generation process, spurious programs can be derived and employed, which prevents the model from capturing helpful logical operations. To address these problems, in this work we formulate table-based fact verification as an evidence retrieval and reasoning framework, proposing the Logic-level Evidence Retrieval and Graph-based Verification network (LERGV). Specifically, we first retrieve logic-level program-like evidence from the given table and statement as supplementary evidence for the table. After that, we construct a logic-level graph to capture the logical relations between entities and functions in the retrieved evidence, and design a graph-based verification network that performs logic-level graph-based reasoning over the constructed graph to classify the final entailment relation. Experimental results on the large-scale benchmark TABFACT show the effectiveness of the proposed approach.
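As a loose illustration of what a logic-level graph over functions and entities can look like, the following toy Python sketch builds such a graph from a program-like piece of evidence (this is not the LERGV implementation; the LogicGraph class and the toy program are invented for this example).

# Illustrative sketch only (not the LERGV release): turn program-like evidence
# into a logic-level graph whose nodes are functions and entities and whose
# edges follow argument structure.
from dataclasses import dataclass, field

@dataclass
class LogicGraph:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)   # (parent, child) pairs

    def add_call(self, func, args):
        self.nodes.add(func)
        for a in args:
            self.nodes.add(a)
            self.edges.add((func, a))

# Toy evidence: eq(max(points), 30) for a statement like "the best score is 30".
g = LogicGraph()
g.add_call("max", ["points"])
g.add_call("eq", ["max", "30"])
print(g.nodes)
print(g.edges)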
Pre-trained models such as BERT have proved to be effective tools for dealing with Information Retrieval (IR) problems. Due to their inspiring performance, they have been widely used to tackle real-world IR problems such as document ranking. Recently, researchers have found that selecting hard rather than random negative samples is beneficial for fine-tuning pre-trained models on ranking tasks. However, it remains elusive how to leverage hard negative samples in a principled way. To address this issue, we propose a fine-tuning strategy for document ranking, namely the Self-Involvement Ranker (SIR), which dynamically selects hard negative samples to construct a high-quality semantic space for training a high-quality ranking model. Specifically, SIR consists of sequential compressors implemented with pre-trained models: the front compressor selects hard negative samples for the rear compressor. Moreover, SIR leverages a supervisory signal to adaptively adjust the semantic space of negative samples. Finally, the supervisory signal in the rear compressor is computed based on conditional probability, which controls the sample dynamics and further enhances model performance. SIR is a lightweight and general framework for pre-trained models that simplifies the ranking process in industrial practice. We test the proposed solution on MS MARCO in the document ranking setting, and the results show that SIR significantly improves the ranking performance of various pre-trained models. Moreover, our method became the new state-of-the-art model (submitted anonymously) on the MS MARCO document ranking leaderboard in May 2021.
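The core selection step can be pictured with a small, hypothetical PyTorch snippet (select_hard_negatives and its signature are invented for illustration; the paper's compressors are full ranking models rather than a single scoring pass): a front model scores candidates, and the highest-scoring non-relevant documents are kept as hard negatives for the rear model.

# Hedged sketch of the general idea behind SIR-style fine-tuning (details differ
# from the paper): a "front" ranker scores candidate documents, and the
# highest-scoring non-relevant ones become hard negatives for the "rear" ranker.
import torch

def select_hard_negatives(front_scores, is_relevant, k=4):
    """front_scores: (N,) scores from the front compressor; is_relevant: (N,) bool mask."""
    neg_scores = front_scores.masked_fill(is_relevant, float("-inf"))
    return torch.topk(neg_scores, k).indices        # indices of the k hardest negatives

scores = torch.tensor([0.9, 0.2, 0.7, 0.8, 0.1])
relevant = torch.tensor([True, False, False, False, False])
print(select_hard_negatives(scores, relevant, k=2))   # tensor([3, 2])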
In Domain Adaptation (DA), where the feature distributions of the source and target domains differ, various distance-based methods have been proposed to minimize the discrepancy between the source and target domains and thus handle the domain shift. In this paper, we propose a new similarity function, called Population Correlation (PC), to measure the domain discrepancy for DA. Based on the PC function, we propose a method called Domain Adaptation by Maximizing Population Correlation (DAMPC) to learn a domain-invariant feature representation for DA. Moreover, most existing DA methods use hand-crafted bottleneck networks, which may limit the capacity and flexibility of the corresponding model. Therefore, we further propose a method called DAMPC with Neural Architecture Search (DAMPC-NAS) to search for the optimal network architecture for DAMPC. Experiments on several benchmark datasets, including Office-31, Office-Home, and VisDA-2017, show that the proposed DAMPC-NAS method achieves better results than state-of-the-art DA methods.
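The abstract does not spell out the PC function itself, so the snippet below is only a generic correlation-based discrepancy between two feature populations (in the spirit of correlation alignment), written in Python as a stand-in; correlation_matrix and correlation_discrepancy are invented names, and the paper's actual definition may differ substantially.

# Generic correlation-based domain discrepancy (NOT the paper's PC definition):
# compare the feature correlation matrices of the source and target populations.
import torch

def correlation_matrix(x):
    x = (x - x.mean(0)) / (x.std(0) + 1e-8)      # standardise each feature
    return (x.T @ x) / (x.shape[0] - 1)

def correlation_discrepancy(source, target):
    return (correlation_matrix(source) - correlation_matrix(target)).pow(2).mean()

src = torch.randn(128, 64)
tgt = torch.randn(128, 64) + 0.5
print(correlation_discrepancy(src, tgt))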
Learning user representations based on historical behaviors lies at the core of modern recommender systems. Recent advances in sequential recommenders have convincingly demonstrated high capability in extracting effective user representations from the given behavior sequences. Despite significant progress, we argue that solely modeling the observed behavior sequences may end up with a brittle and unstable system due to the noisy and sparse nature of logged user interactions. In this paper, we propose to learn accurate and robust user representations, which are required to be less sensitive to (attacks on) noisy behaviors and to rely more on the indispensable ones, by modeling the counterfactual data distribution. Specifically, given an observed behavior sequence, the proposed CauseRec framework identifies dispensable and indispensable concepts at both the fine-grained item level and the abstract interest level. CauseRec conditionally samples user concept sequences from the counterfactual data distributions by replacing dispensable and indispensable concepts within the original concept sequence. With user representations obtained from the synthesized user sequences, CauseRec performs contrastive user representation learning by contrasting the counterfactual with the observational. We conduct extensive experiments on real-world public recommendation benchmarks and justify the effectiveness of CauseRec with multi-aspect model analysis. The results demonstrate that the proposed CauseRec outperforms state-of-the-art sequential recommenders by learning accurate and robust user representations.
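A highly simplified, hypothetical Python sketch of the counterfactual sampling idea follows (CauseRec actually operates on item- and interest-level concepts and learns which ones are dispensable; here the split is simply given, and counterfactual_views is an invented helper): replacing dispensable elements yields an identity-preserving positive view, while replacing indispensable ones yields an identity-breaking negative view for contrastive learning.

# Toy counterfactual sampling in the spirit of CauseRec (not the paper's code):
# swap out dispensable items for a positive view and indispensable items for a
# negative view of the same user behavior sequence.
import random

def counterfactual_views(seq, dispensable_idx, indispensable_idx, item_pool):
    pos = list(seq)
    for i in dispensable_idx:                 # identity-preserving replacement
        pos[i] = random.choice(item_pool)
    neg = list(seq)
    for i in indispensable_idx:               # identity-breaking replacement
        neg[i] = random.choice(item_pool)
    return pos, neg

seq = [11, 42, 7, 99, 3]
pos, neg = counterfactual_views(seq, dispensable_idx=[1, 4], indispensable_idx=[0, 3],
                                item_pool=list(range(1000)))
print(pos, neg)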
Based on the Gaia Second Data Release and the combination of nonparametric bivariate density estimation with least-squares ellipse fitting, we derive the shape parameters of the sample clusters. By analyzing the dislocation of the sample clusters, we find that the dislocation $d$ is related to the X-axis pointing toward the Galactic center, the Y-axis pointing in the direction of Galactic rotation, and the Z-axis (log(|H|/pc)) that is positive toward the Galactic north pole. This finding underlines the important role of the dislocation of clusters in tracking the external environment of the Milky Way. The orientations ($q_{pm}$) of the clusters with $e_{pm}$ $\geq$ 0.4 show an aggregated distribution in the range of -45$\degr$ to 45$\degr$, comprising about 74% of them. This probably suggests that these clusters tend to deform heavily in the direction of the Galactic plane. NGC 752 is in a slight stage of expansion in two-dimensional space and, if no other events occur, will deform its morphology along the direction perpendicular to the original stretching direction. The relative degree of deformation of the sample clusters in the short-axis direction decreases as their ages increase. On average, the severely distorted sample clusters in each group account for about 26% $\pm$ 9%. This possibly implies a uniform external environment in the range of $|$H$|$ $\leq$ 300 pc if the sample completeness of each group is not taken into account.
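For readers unfamiliar with the ellipse-fitting step, below is a minimal, hypothetical Python/NumPy sketch of a plain algebraic least-squares conic fit (the paper's pipeline, which couples density estimation with ellipse fitting, is certainly more elaborate; fit_ellipse_algebraic and the synthetic contour points are invented for illustration).

# Algebraic least-squares fit of a conic a x^2 + b xy + c y^2 + d x + e y + f = 0
# to points sampled along a density contour; this is the simplest unconstrained
# variant, not necessarily the one used in the paper.
import numpy as np

def fit_ellipse_algebraic(x, y):
    D = np.column_stack([x**2, x*y, y**2, x, y, np.ones_like(x)])
    # Least-squares solution = right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(D)
    return vt[-1]

# Synthetic contour: a tilted ellipse plus a little noise.
t = np.linspace(0, 2*np.pi, 200)
x, y = 3*np.cos(t), 1.5*np.sin(t)
x, y = x*np.cos(0.4) - y*np.sin(0.4), x*np.sin(0.4) + y*np.cos(0.4)
coeffs = fit_ellipse_algebraic(x + 0.05*np.random.randn(t.size),
                               y + 0.05*np.random.randn(t.size))
print(coeffs / coeffs[0])   # conic coefficients, normalised for readability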
MingYang Wei, Jie Lian, Yu Zhang (2021)
Palladium diselenide (PdSe$_2$), a new type of two-dimensional noble metal dichalcogenide (NMDC), has received widespread attention for its excellent electrical and optoelectronic properties. Herein, high-quality continuous centimeter-scale PdSe$_2$ films with layer numbers in the range of 3L-15L were grown using the chemical vapor deposition (CVD) method. The absorption spectra and DFT calculations revealed that the bandgap of the PdSe$_2$ films decreases with increasing number of layers, which is due to enhanced orbital hybridization in PdSe$_2$. Spectroscopic ellipsometry (SE) analysis shows that PdSe$_2$ has significant layer-dependent optical and dielectric properties, mainly owing to the unique strong exciton effect of the thin PdSe$_2$ film in the UV band. In particular, the effect of temperature on the optical properties of PdSe$_2$ films was also examined, and the thermo-optic coefficients of PdSe$_2$ films with different numbers of layers were calculated. This study provides fundamental guidance for the fabrication and optimization of PdSe$_2$-based optoelectronic devices.
Xiang Chen, Ningyu Zhang, Lei Li (2021)
Most existing NER methods rely on extensive labeled data for model training and therefore struggle in low-resource scenarios with limited training data. Recently, prompt-tuning methods for pre-trained language models have achieved remarkable performance in few-shot learning by exploiting prompts as task guidance to reduce the gap between pre-training and downstream tuning. Inspired by prompt learning, we propose a novel lightweight generative framework with prompt-guided attention for low-resource NER (LightNER). Specifically, we construct a semantic-aware answer space of entity categories for prompt learning to generate the entity span sequence and entity categories without any label-specific classifiers. We further propose prompt-guided attention, which incorporates continuous prompts into the self-attention layer to re-modulate the attention and adapt the pre-trained weights. Note that we only tune those continuous prompts while keeping all parameters of the pre-trained language model fixed, which makes our approach lightweight and flexible for low-resource scenarios and better able to transfer knowledge across domains. Experimental results show that LightNER can obtain comparable performance in the standard supervised setting and outperform strong baselines in low-resource settings by tuning only a small fraction of the parameters.
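A rough, hypothetical PyTorch sketch of prompt-guided attention in the prefix-tuning style is given below (the exact re-modulation used in LightNER may differ; PromptedSelfAttention and prompt_len are invented names): learnable prompt vectors are prepended to the keys and values of a frozen self-attention layer, so only the prompts receive gradients.

# Sketch of prefix-style prompt-guided attention (assumption: LightNER's actual
# mechanism may differ): frozen QKV projection, trainable prompt keys/values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptedSelfAttention(nn.Module):
    def __init__(self, dim=256, prompt_len=10):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        for p in self.qkv.parameters():        # pre-trained weights stay frozen
            p.requires_grad = False
        self.prompt_k = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
        self.prompt_v = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)

    def forward(self, x):                       # x: (batch, seq, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        k = torch.cat([self.prompt_k.expand(x.size(0), -1, -1), k], dim=1)
        v = torch.cat([self.prompt_v.expand(x.size(0), -1, -1), v], dim=1)
        attn = F.softmax(q @ k.transpose(1, 2) / k.size(-1) ** 0.5, dim=-1)
        return attn @ v

out = PromptedSelfAttention()(torch.randn(2, 16, 256))
print(out.shape)   # torch.Size([2, 16, 256])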
This paper proposes a spatio-temporal model for wind speed prediction which can be run at different resolutions. The model assumes that the wind prediction of a cluster is correlated with its upstream influences in recent history, and the correlation between clusters is represented by a directed dynamic graph. A Bayesian approach is also described in which prior beliefs about the predictive errors at different data resolutions are represented in the form of Gaussian processes. The joint framework enhances predictive performance by combining results from predictions at different data resolutions and provides reasonable uncertainty quantification. The model is evaluated on actual wind data from the Midwest U.S. and shows superior performance compared to traditional baselines.
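As a toy illustration only (not the paper's model; combine_forecasts is an invented helper), the flavour of the multi-resolution combination can be seen in a precision-weighted average of forecasts, which is the kind of result a Gaussian error model at each resolution would also produce.

# Toy Python example: combine forecasts made at different resolutions by weighting
# each with the inverse of its estimated error variance.
import numpy as np

def combine_forecasts(means, variances):
    means, variances = np.asarray(means, float), np.asarray(variances, float)
    w = 1.0 / variances
    mean = np.sum(w * means) / np.sum(w)
    var = 1.0 / np.sum(w)                 # variance of the combined estimate
    return mean, var

print(combine_forecasts([8.1, 7.6], [0.9, 0.4]))   # e.g. hourly vs. 10-minute forecast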