Research papers, master's and doctoral theses about order

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little

192 - Association for Computational Linguistics 2021 Article

A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in classical NLP pipelines. In this paper, we propose a different explanation: MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics. To demonstrate this, we pre-train MLMs on sentences with randomly shuffled word order, and show that these models still achieve high accuracy after fine-tuning on many downstream tasks, including tasks specifically designed to be challenging for models that ignore word order. Our models perform surprisingly well according to some parametric syntactic probes, indicating possible deficiencies in how we test representations for syntactic information. Overall, our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
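
The shuffling step described in this abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and a fixed random seed, neither of which is stated in the abstract, and the two-sentence corpus is hypothetical. It shows only how word order can be destroyed while the bag of words in each sentence is preserved; it is not the authors' preprocessing code.

```python
import random


def shuffle_sentence(sentence: str, rng: random.Random) -> str:
    """Return the sentence with its whitespace-separated words in random order."""
    words = sentence.split()
    rng.shuffle(words)  # in-place permutation; the bag of words is unchanged
    return " ".join(words)


# Hypothetical two-sentence corpus, for illustration only.
corpus = [
    "the cat sat on the mat",
    "word order matters little for pre-training",
]

rng = random.Random(0)  # fixed seed (an assumption) so the shuffle is reproducible
shuffled_corpus = [shuffle_sentence(s, rng) for s in corpus]

for original, shuffled in zip(corpus, shuffled_corpus):
    print(f"{original!r} -> {shuffled!r}")
```

Because each sentence is permuted independently, co-occurrence statistics within a sentence survive while positional information does not, which is the contrast the abstract relies on.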

Bayesian mutual information, word matters pre-training, order word matters

LayoutReader: Pre-training of Text and Layout for Reading Order Detection

234 - Association for Computational Linguistics 2021 Article

Reading order detection is the cornerstone to understanding visually-rich documents (e.g., receipts and forms). Unfortunately, no existing work took advantage of advanced deep learning models because it is too laborious to annotate a large enough dataset. We observe that the reading order of Word documents is embedded in their XML metadata; meanwhile, it is easy to convert Word documents to PDFs or images. Therefore, in an automated manner, we construct ReadingBank, a benchmark dataset that contains reading order, text, and layout information for 500,000 document images covering a wide spectrum of document types. This first-ever large-scale dataset unleashes the power of deep neural networks for reading order detection. Specifically, our proposed LayoutReader captures the text and layout information for reading order prediction using the seq2seq model. It performs almost perfectly in reading order detection and significantly improves both open-source and commercial OCR engines in ordering text lines in their results in our experiments. The dataset and models are publicly available at https://aka.ms/layoutreader.

reading order detection, reading order, order detection

Learning and Analyzing Generation Order for Undirected Sequence Models

229 - Association for Computational Linguistics 2021 Article

Undirected neural sequence models have achieved performance competitive with the state-of-the-art directed sequence models that generate monotonically from left to right in machine translation tasks. In this work, we train a policy that learns the generation order for a pre-trained, undirected translation model via reinforcement learning. We show that the translations decoded by our learned orders achieve higher BLEU scores than the outputs decoded from left to right or decoded by the learned order from Mansimov et al. (2019) on the WMT'14 German-English translation task. On examples with a maximum source and target length of 30 from De-En and WMT'16 English-Romanian tasks, our learned order outperforms all heuristic generation orders on three out of four language pairs. We next carefully analyze the learned order patterns via qualitative and quantitative analysis. We show that our policy generally follows an outer-to-inner order, predicting the left-most and right-most positions first, and then moving toward the middle while skipping less important words at the beginning. Furthermore, the policy usually predicts positions for a single syntactic constituent structure in consecutive steps. We believe our findings could provide more insights on the mechanism of undirected generation models and encourage further research in this direction.

analyzing generation order, sequence models, undirected sequence models

Leveraging Order-Free Tag Relations for Context-Aware Recommendation

214 - Association for Computational Linguistics 2021 Article

Tag recommendation relies on either a ranking function for top-k tags or an autoregressive generation method. However, the previous methods neglect one of two seemingly conflicting yet desirable characteristics of a tag set: orderlessness and inter-dependency. While the ranking approach fails to address the inter-dependency among tags when they are ranked, the autoregressive approach fails to take orderlessness into account because it is designed to utilize sequential relations among tokens. We propose a sequence-oblivious generation method for tag recommendation, in which the next tag to be generated is independent of the order of the generated tags and the order of the ground truth tags occurring in training data. Empirical results on two different domains, Instagram and Stack Overflow, show that our method is significantly superior to the previous approaches.

leveraging order-free tag, order-free tag relations, leveraging order-free

Diagnosing the First-Order Logical Reasoning Ability Through LogicNLI

186 - Association for Computational Linguistics 2021 Article

Recently, language models (LMs) have achieved significant performance on many NLU tasks, which has spurred widespread interest in their possible applications in the scientific and social areas. However, LMs have faced much criticism over whether they are truly capable of reasoning in NLU. In this work, we propose a diagnostic method for first-order logic (FOL) reasoning with a newly proposed benchmark, LogicNLI. LogicNLI is an NLI-style dataset that effectively disentangles the target FOL reasoning from commonsense inference and can be used to diagnose LMs from four perspectives: accuracy, robustness, generalization, and interpretability. Experiments on BERT, RoBERTa, and XLNet have uncovered the weaknesses of these LMs on FOL reasoning, which motivates future exploration to enhance the reasoning ability.

logical reasoning ability, first-order logical reasoning

On the Evolution of Word Order

139 - Association for Computational Linguistics 2021 Article

Most natural languages have a predominant or fixed word order. For example, in English the word order is usually Subject-Verb-Object. This work attempts to explain this phenomenon as well as other typological findings regarding word order from a functional perspective. In particular, we examine whether fixed word order provides a functional advantage, explaining why these languages are prevalent. To this end, we consider an evolutionary model of language and demonstrate, both theoretically and using genetic algorithms, that a language with a fixed word order is optimal. We also show that adding information to the sentence, such as case markers and noun-verb distinction, reduces the need for fixed word order, in accordance with the typological findings.

word order, fixed word order

Investigating Dominant Word Order on Universal Dependencies with Graph Rewriting

180 - Association for Computational Linguistics 2021 Article

This paper details experiments we performed on the Universal Dependencies 2.7 corpora in order to investigate the dominant word order in the available languages. For this purpose, we used a graph rewriting tool, GREW, which allowed us to go beyond the surface annotations and identify the implicit subjects. We first measured the distribution of the six different word orders (SVO, SOV, VSO, VOS, OVS, OSV) in the corpora and investigated when there was a significant difference in the corpora within a given language. Then, we compared the obtained results with information provided in the WALS database (Dryer and Haspelmath, 2013) and in (Östling, 2015). Finally, we examined the impact of using a graph rewriting tool for this task. The tools and resources used for this research are all freely available.
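
As a rough illustration of the measurement step in this abstract, the sketch below computes the distribution of the six word orders from clause-level token positions. The input format, a `(subject, verb, object)` triple of token indices per clause, is a hypothetical simplification. Extracting those positions from Universal Dependencies treebanks, which the authors do with GREW, is not shown, and the function names are illustrative rather than taken from the paper.

```python
from collections import Counter


def word_order(subj_pos: int, verb_pos: int, obj_pos: int) -> str:
    """Label a clause as SVO, SOV, VSO, VOS, OVS or OSV from token positions."""
    roles = sorted([(subj_pos, "S"), (verb_pos, "V"), (obj_pos, "O")])
    return "".join(label for _, label in roles)


def order_distribution(clauses):
    """Return the relative frequency of each word order over a list of clauses."""
    counts = Counter(word_order(*clause) for clause in clauses)
    total = sum(counts.values())
    return {order: n / total for order, n in counts.items()}


# Hypothetical (subject, verb, object) token positions for four clauses.
clauses = [(1, 2, 3), (1, 2, 4), (1, 3, 2), (3, 1, 2)]
print(order_distribution(clauses))
```

The most frequent label in the resulting dictionary would be a candidate for the dominant order of the corpus. Deciding whether the difference between corpora is significant, as the abstract describes, would need a statistical test on top of these counts.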

dominant word order, investigating dominant word, universal dependencies

Second Order WinoBias (SoWinoBias) Test Set for Latent Gender Bias Detection in Coreference Resolution

123 - Association for Computational Linguistics 2021 Article

We observe an instance of gender-induced bias in a downstream application, despite the absence of explicit gender words in the test cases. We provide a test set, SoWinoBias, for the purpose of measuring such latent gender bias in coreference resolution systems. We evaluate the performance of current debiasing methods on the SoWinoBias test set, especially in reference to the method's design and altered embedding space properties. See https://github.com/hillary-dawkins/SoWinoBias.

gender bias detection, latent gender bias, order winobias

Domain Divergences: A Survey and Empirical Analysis

178 - Association for Computational Linguistics 2021 Article

Domain divergence plays a significant role in estimating the performance of a model in new domains. While there is a significant literature on divergence measures, researchers find it hard to choose an appropriate divergence for a given NLP application. We address this shortcoming both by surveying the literature and through an empirical study. We develop a taxonomy of divergence measures consisting of three classes (Information-theoretic, Geometric, and Higher-order measures) and identify the relationships between them. Further, to understand the common use-cases of these measures, we recognise three novel applications -- 1) Data Selection, 2) Learning Representation, and 3) Decisions in the Wild -- and use them to organise our literature. From this, we identify that Information-theoretic measures are prevalent for 1) and 3), and Higher-order measures are more common for 2). To further help researchers choose appropriate measures to predict drop in performance -- an important aspect of Decisions in the Wild -- we perform correlation analysis spanning 130 domain adaptation scenarios, 3 varied NLP tasks and 12 divergence measures identified from our survey. To calculate these divergences, we consider the current contextual word representations (CWR) and contrast with the older distributed representations. We find that traditional measures over word distributions still serve as strong baselines, while higher-order measures with CWR are effective.
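
To make the idea of an information-theoretic measure over word distributions concrete, the sketch below computes the Jensen-Shannon divergence between the unigram distributions of two token lists. Jensen-Shannon is chosen here only as one familiar example of this class; the abstract does not name specific measures. The toy domains and function names are hypothetical.

```python
import math
from collections import Counter


def word_distribution(tokens, vocab):
    """Relative frequency of each vocabulary word in a non-empty token list."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[w] / total for w in vocab]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical token lists standing in for a source and a target domain.
source = "the patient was given a dose of the drug".split()
target = "the team was given a penalty in the match".split()

vocab = sorted(set(source) | set(target))  # shared vocabulary over both domains
p = word_distribution(source, vocab)
q = word_distribution(target, vocab)
print(f"JS divergence: {js_divergence(p, q):.3f} bits")
```

Because the mixture `m` is non-zero wherever `p` or `q` is, the divergence stays finite even when the two domains share no words, which is one practical reason to prefer it over plain KL divergence in a sketch like this.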

measures, higher-order measures, divergence

Variance-reduced First-order Meta-learning for Natural Language Processing Tasks

239 - Association for Computational Linguistics 2021 Article

First-order meta-learning algorithms have been widely used in practice to learn initial model parameters that can be quickly adapted to new tasks due to their efficiency and effectiveness. However, existing studies find that the meta-learner can overfit to some specific adaptation when we have heterogeneous tasks, leading to significantly degraded performance. In Natural Language Processing (NLP) applications, datasets are often diverse and each task has its unique characteristics. Therefore, to address the overfitting issue when applying first-order meta-learning to NLP applications, we propose to reduce the variance of the gradient estimator used in task adaptation. To this end, we develop a variance-reduced first-order meta-learning algorithm. The core of our algorithm is to introduce a novel variance reduction term to the gradient estimation when performing the task adaptation. Experiments on two NLP applications, few-shot text classification and multi-domain dialog state tracking, demonstrate the superior performance of our proposed method.

network linking, language processing tasks, first-order meta-learning
