Research papers, master and doctoral theses about syntactic

Comparing learnability of two dependency schemes: `semantic' (UD) and `syntactic' (SUD)

915 - Association for Computation Linguistics 2021 مقالة

This paper contributes to the thread of research on the learnability of different dependency annotation schemes: one (semantic') favouring content words as heads of dependency relations and the other (syntactic') favouring syntactic heads. Several st udies have lent support to the idea that choosing syntactic criteria for assigning heads in dependency trees improves the performance of dependency parsers. This may be explained by postulating that syntactic approaches are generally more learnable. In this study, we test this hypothesis by comparing the performance of five parsing systems (both transition- and graph-based) on a selection of 21 treebanks, each in a semantic' variant, represented by standard UD (Universal Dependencies), and a syntactic' variant, represented by SUD (Surface-syntactic Universal Dependencies): unlike previously reported experiments, which considered learnability of semantic' and syntactic' annotations of particular constructions in vitro, the experiments reported here consider whole annotation schemes in vivo. Additionally, we compare these annotation schemes using a range of quantitative syntactic properties, which may also reflect their learnability. The results of the experiments show that SUD tends to be more learnable than UD, but the advantage of one or the other scheme depends on the parser and the corpus in question.

surface-syntactic universal dependencies syntactic تبعيات عالمية السطحية صناعة حمض الفوسفور

Predicting emergent linguistic compositions through time: Syntactic frame extension via multimodal chaining

655 - Association for Computation Linguistics 2021 مقالة

Natural language relies on a finite lexicon to express an unbounded set of emerging ideas. One result of this tension is the formation of new compositions, such that existing linguistic units can be combined with emerging items into novel expressions . We develop a framework that exploits the cognitive mechanisms of chaining and multimodal knowledge to predict emergent compositional expressions through time. We present the syntactic frame extension model (SFEM) that draws on the theory of chaining and knowledge from percept'', concept'', and language'' to infer how verbs extend their frames to form new compositions with existing and novel nouns. We evaluate SFEM rigorously on the 1) modalities of knowledge and 2) categorization models of chaining, in a syntactically parsed English corpus over the past 150 years. We show that multimodal SFEM predicts newly emerged verb syntax and arguments substantially better than competing models using purely linguistic or unimodal knowledge. We find support for an exemplar view of chaining as opposed to a prototype view and reveal how the joint approach of multimodal chaining may be fundamental to the creation of literal and figurative language uses including metaphor and metonymy.

syntactic frame extension predicting emergent linguistic chaining ملحق الإطار النحوي التنبؤ اللغوية الناشئة سحر صناعة حمض الفوسفور المزيد..

AESOP: Paraphrase Generation with Adaptive Syntactic Control

759 - Association for Computation Linguistics 2021 مقالة

We propose to control paraphrase generation through carefully chosen target syntactic structures to generate more proper and higher quality paraphrases. Our model, AESOP, leverages a pretrained language model and adds deliberately chosen syntactical control via a retrieval-based selection module to generate fluent paraphrases. Experiments show that AESOP achieves state-of-the-art performances on semantic preservation and syntactic conformation on two benchmark datasets with ground-truth syntactic control from human-annotated exemplars. Moreover, with the retrieval-based target syntax selection module, AESOP generates paraphrases with even better qualities than the current best model using human-annotated target syntactic parses according to human evaluation. We further demonstrate the effectiveness of AESOP to improve classification models' robustness to syntactic perturbation by data augmentation on two GLUE tasks.

adaptive syntactic control generation with adaptive control paraphrase generation السيطرة على النحوية التكيفية جيل مع التكيف التحكم في إعادة صياغة النص صناعة حمض الفوسفور المزيد..

Simplifying annotation of intersections in time normalization annotation: exploring syntactic and semantic validation

1183 - Association for Computation Linguistics 2021 مقالة

While annotating normalized times in food security documents, we found that the semantically compositional annotation for time normalization (SCATE) scheme required several near-duplicate annotations to get the correct semantics for expressions like Nov. 7th to 11th 2021. To reduce this problem, we explored replacing SCATE's Sub-Interval property with a Super-Interval property, that is, making the smallest units (e.g., 7th and 11th) rather than the largest units (e.g., 2021) the heads of the intersection chains. To ensure that the semantics of annotated time intervals remained unaltered despite our changes to the syntax of the annotation scheme, we applied several different techniques to validate our changes. These validation techniques detected and allowed us to resolve several important bugs in our automated translation from Sub-Interval to Super-Interval syntax.

exploring syntactic time normalization annotation time normalization استكشاف النحوية تطبيع الوقت التطبلق تطبيع الوقت صناعة حمض الفوسفور المزيد..

Unsupervised Chunking as Syntactic Structure Induction with a Knowledge-Transfer Approach

652 - Association for Computation Linguistics 2021 مقالة

In this paper, we address unsupervised chunking as a new task of syntactic structure induction, which is helpful for understanding the linguistic structures of human languages as well as processing low-resource languages. We propose a knowledge-trans fer approach that heuristically induces chunk labels from state-of-the-art unsupervised parsing models; a hierarchical recurrent neural network (HRNN) learns from such induced chunk labels to smooth out the noise of the heuristics. Experiments show that our approach largely bridges the gap between supervised and unsupervised chunking.

syntactic structure induction syntactic structure structure induction هيكل النحوية التعريفي هيكل النحوية هيكل التعريفي صناعة حمض الفوسفور المزيد..

A Corpus-based Syntactic Analysis of Two-termed Unlike Coordination

664 - Association for Computation Linguistics 2021 مقالة

Coordination is a phenomenon of language that conjoins two or more terms or phrases using a coordinating conjunction. Although coordination has been explored extensively in the linguistics literature, the rules and constraints that govern its structu re are still largely elusive and widely debated amongst linguists. This paper presents a study of two-termed unlike coordinations in particular, where the two conjuncts of the coordination phrase form valid constituents but have distinct categories. We conducted a syntactic analysis of the phrasal categories that can be conjoined in such unlike coordinations through a computational corpus-based approach, utilizing the Corpus of Contemporary American English (COCA) as the main data source, as well as the Penn Treebank (PTB). The results show that the two conjuncts within unlike coordinations display different properties based on their position, supporting an antisymmetric view of the structure of coordination. This research provides new data and perspectives through the use of statistical techniques that can help shape future theories and models of coordination.

two-termed unlike coordination corpus-based syntactic analysis unlike coordinations على عكس التنسيق على عكس التحليل النحوي القائم على Corpus على عكس التنسيق صناعة حمض الفوسفور المزيد..

On the Relation between Syntactic Divergence and Zero-Shot Performance

530 - Association for Computation Linguistics 2021 مقالة

We explore the link between the extent to which syntactic relations are preserved in translation and the ease of correctly constructing a parse tree in a zero-shot setting. While previous work suggests such a relation, it tends to focus on the macro level and not on the level of individual edges---a gap we aim to address. As a test case, we take the transfer of Universal Dependencies (UD) parsing from English to a diverse set of languages and conduct two sets of experiments. In one, we analyze zero-shot performance based on the extent to which English source edges are preserved in translation. In another, we apply three linguistically motivated transformations to UD, creating more cross-lingually stable versions of it, and assess their zero-shot parsability. In order to compare parsing performance across different schemes, we perform extrinsic evaluation on the downstream task of cross-lingual relation extraction (RE) using a subset of a standard English RE benchmark translated to Russian and Korean. In both sets of experiments, our results suggest a strong relation between cross-lingual stability and zero-shot parsing performance.

syntactic divergence syntactic relations الاختلاف النحوي العلاقات النحوية صناعة حمض الفوسفور

Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with Controllable Perturbations

793 - Association for Computation Linguistics 2021 مقالة

Recent research has adopted a new experimental field centered around the concept of text perturbations which has revealed that shuffled word order has little to no impact on the downstream performance of Transformer-based language models across many NLP tasks. These findings contradict the common understanding of how the models encode hierarchical and structural information and even question if the word order is modeled with position embeddings. To this end, this paper proposes nine probing datasets organized by the type of controllable text perturbation for three Indo-European languages with a varying degree of word order flexibility: English, Swedish and Russian. Based on the probing analysis of the M-BERT and M-BART models, we report that the syntactic sensitivity depends on the language and model pre-training objectives. We also find that the sensitivity grows across layers together with the increase of the perturbation granularity. Last but not least, we show that the models barely use the positional information to induce syntactic trees from their intermediate self-attention and contextualized representations.

sesame street shaking syntactic trees multilingual probing شارع سمسم تهز الأشجار النحوية التحقيق متعدد اللغات صناعة حمض الفوسفور المزيد..

Frequency Effects on Syntactic Rule Learning in Transformers

894 - Association for Computation Linguistics 2021 مقالة

Pre-trained language models perform well on a variety of linguistic tasks that require symbolic reasoning, raising the question of whether such models implicitly represent abstract symbols and rules. We investigate this question using the case study of BERT's performance on English subject--verb agreement. Unlike prior work, we train multiple instances of BERT from scratch, allowing us to perform a series of controlled interventions at pre-training time. We show that BERT often generalizes well to subject--verb pairs that never occurred in training, suggesting a degree of rule-governed behavior. We also find, however, that performance is heavily influenced by word frequency, with experiments showing that both the absolute frequency of a verb form, as well as the frequency relative to the alternate inflection, are causally implicated in the predictions BERT makes at inference time. Closer analysis of these frequency effects reveals that BERT's behavior is consistent with a system that correctly applies the SVA rule in general but struggles to overcome strong training priors and to estimate agreement features (singular vs. plural) on infrequent lexical items.

syntactic rule learning learning in transformers syntactic rule الحكم النحوي التعلم التعلم في المحولات صناعة حمض الفوسفور

Transformer with Syntactic Position Encoding for Machine Translation

690 - Association for Computation Linguistics 2021 مقالة

It has been widely recognized that syntax information can help end-to-end neural machine translation (NMT) systems to achieve better translation. In order to integrate dependency information into Transformer based NMT, existing approaches either expl oit words' local head-dependent relations, ignoring their non-local neighbors carrying important context; or approximate two words' syntactic relation by their relative distance on the dependency tree, sacrificing exactness. To address these issues, we propose global positional encoding for dependency tree, a new scheme that facilitates syntactic relation modeling between any two words with keeping exactness and without immediate neighbor constraint. Experiment results on NC11 German→English, English→German and WMT English→German datasets show that our approach is more effective than the above two strategies. In addition, our experiments quantitatively show that compared with higher layers, lower layers of the model are more proper places to incorporate syntax information in terms of each layer's preference to the syntactic pattern and the final performance.

syntactic position encoding وضع النحوية ترميز صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد