Research papers and master's and doctoral theses published by Xiangci Li

A Paragraph-level Multi-task Learning Model for Scientific Fact-Verification

96 - Xiangci Li, Gully Burns, Nanyun Peng 2020

Even for domain experts, it is a non-trivial task to verify a scientific claim by providing supporting or refuting evidence rationales. The situation worsens as misinformation proliferates on social media and news websites, manually or programmatically, at every moment. As a result, an automatic fact-verification tool becomes crucial for combating the spread of misinformation. In this work, we propose a novel, paragraph-level, multi-task learning model for the SciFact task by directly computing a sequence of contextualized sentence embeddings from a BERT model and jointly training the model on rationale selection and stance prediction.

Computation and Language

Context-aware Stand-alone Neural Spelling Correction

101 - Xiangci Li, Hairong Liu, Liang Huang 2020

Existing natural language processing systems are vulnerable to noisy inputs resulting from misspellings. By contrast, humans can easily infer the corresponding correct words from their misspellings and surrounding context. Inspired by this, we address the stand-alone spelling correction problem, which only corrects the spelling of each token without additional token insertion or deletion, by utilizing both spelling information and global context representations. We present a simple yet powerful solution that jointly detects and corrects misspellings as a sequence labeling task by fine-tuning a pre-trained language model. Our solution outperforms the previous state-of-the-art result by 12.8% absolute F0.5 score.

Computation and Language

Scientific Discourse Tagging for Evidence Extraction

102 - Xiangci Li, Gully Burns, Nanyun Peng 2019

Evidence plays a crucial role in any biomedical research narrative, providing justification for some claims and refutation for others. We seek to build models of scientific argument using information extraction methods from full-text papers. We present the capability of automatically extracting text fragments from primary research papers that describe the evidence presented in that paper's figures, which arguably provides the raw material of any scientific argument made within the paper. We apply richly contextualized deep representation learning pre-trained on a biomedical domain corpus to the analysis of scientific discourse structures and the extraction of evidence fragments (i.e., the text in the results section describing data presented in a specified subfigure) from a set of biomedical experimental research articles. We first demonstrate our state-of-the-art scientific discourse tagger on two scientific discourse tagging datasets and its transferability to new datasets. We then show the benefit of leveraging scientific discourse tags for downstream tasks such as claim extraction and evidence fragment detection. Our work demonstrates the potential of using evidence fragments derived from figure spans for improving the quality of scientific claims by cataloging, indexing and reusing evidence fragments as independent documents.

Computation and Language
