Comparing Supervised Machine Learning Techniques for Genre Analysis in Software Engineering Research Articles

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Language: English

Created by Shamra Editor

Keywords: supervised machine learning; machine learning techniques; engineering research articles

Abstract

Written communication is of utmost importance to the progress of scientific research. The speed of that progress, however, may be limited by the scarcity of reviewers available to referee research articles. In this context, automatic approaches that can query linguistic segments in written contributions by detecting the presence or absence of common rhetorical patterns have become a necessity. This paper compares supervised machine learning techniques for genre analysis of the Introduction sections of software engineering articles. A semi-supervised approach was used to augment the number of annotated sentences in SciSents (Available on: ANONYMOUS). Two supervised approaches, using SVM and logistic regression, were then evaluated by F-score for genre analysis on the corpus. A technique based on logistic regression and BERT performed genre analysis highly satisfactorily, with an average F-score of 88.25 when retrieving patterns at an overall level.
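To make the evaluation metric concrete, the sketch below computes a per-label F-score and its unweighted average for a sentence-level classification task. It is only an illustration: the abstract does not say which averaging scheme the authors used, and the label names (`background`, `gap`, `purpose`) and the toy predictions are hypothetical, not taken from SciSents.

```python
from collections import Counter


def f_scores(gold, predicted):
    """Return per-label F1 scores and their unweighted (macro) average."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed

    per_label = {}
    for label in sorted(set(gold) | set(predicted)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        denom = precision + recall
        per_label[label] = 2 * precision * recall / denom if denom else 0.0

    macro = sum(per_label.values()) / len(per_label)
    return per_label, macro


# Hypothetical rhetorical labels for six sentences (assumed, not from the paper).
gold = ["background", "gap", "purpose", "background", "gap", "purpose"]
pred = ["background", "gap", "purpose", "gap", "gap", "background"]

per_label, macro = f_scores(gold, pred)
for label, score in per_label.items():
    print(f"{label}: {score:.3f}")
print(f"average F-score: {macro:.3f}")
```

Comparing two classifiers, such as an SVM and a logistic regression model, then amounts to calling `f_scores` once per model on the same gold labels and comparing the averages.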

References used

https://aclanthology.org/

Learning to Selectively Learn for Weakly-supervised Paraphrase Generation

Association for Computational Linguistics, 2021 (article)

Paraphrase generation is a longstanding NLP task that has diverse applications in downstream NLP tasks. However, the effectiveness of existing efforts predominantly relies on large amounts of gold-labeled data. Though unsupervised endeavors have been proposed to alleviate this issue, they may fail to generate meaningful paraphrases due to the lack of supervision signals. In this work, we go beyond the existing paradigms and propose a novel approach to generate high-quality paraphrases with weakly supervised data. Specifically, we tackle the weakly-supervised paraphrase generation problem by: (1) obtaining abundant weakly-labeled parallel sentences via retrieval-based pseudo paraphrase expansion; and (2) developing a meta-learning framework to progressively select valuable samples for fine-tuning a pre-trained language model, BART, on the sentential paraphrasing task. We demonstrate that our approach achieves significant improvements over existing unsupervised approaches, and is even comparable in performance with supervised state-of-the-art approaches.

Keywords: selectively learn; learning to selectively; weakly-supervised paraphrase generation

Genre as Weak Supervision for Cross-lingual Dependency Parsing

Association for Computational Linguistics, 2021 (article)

Recent work has shown that monolingual masked language models learn to represent data-driven notions of language variation which can be used for domain-targeted training data selection. Dataset genre labels are already frequently available, yet remain largely unexplored in cross-lingual setups. We harness this genre metadata as a weak supervision signal for targeted data selection in zero-shot dependency parsing. Specifically, we project treebank-level genre information to the finer-grained sentence level, with the goal of amplifying information implicitly stored in unsupervised contextualized representations. We demonstrate that genre is recoverable from multilingual contextual embeddings and that it provides an effective signal for training data selection in cross-lingual, zero-shot scenarios. For 12 low-resource language treebanks, six of which are test-only, our genre-specific methods significantly outperform competitive baselines as well as recent embedding-based methods for data selection. Moreover, genre-based data selection provides new state-of-the-art results for three of these target languages.

Keywords: work identification; data selection

Active Learning for Interactive Relation Extraction in a French Newspaper's Articles

Association for Computational Linguistics, 2021 (article)

Relation extraction is a subtask of natural language processing that has seen many improvements in recent years, with the advent of complex pre-trained architectures. Many of these state-of-the-art approaches are tested against benchmarks with labelled sentences containing tagged entities, and require substantial pre-training and fine-tuning on task-specific data. However, in a real use-case scenario such as a newspaper company mostly dedicated to local information, relations are of varied, highly specific types, with virtually no annotated data for such relations, and many entities co-occur in a sentence without being related. We question the use of supervised state-of-the-art models in such a context, where resources such as time, computing power and human annotators are limited. To adapt to these constraints, we experiment with an active-learning based relation extraction pipeline, consisting of a lightweight binary LSTM-based model for detecting the relations that do exist, and a state-of-the-art model for relation classification. We compare several choices for classification models in this scenario, from basic word embedding averaging to graph neural networks and BERT-based ones, as well as several active learning acquisition strategies, in order to find the most cost-efficient yet accurate approach in our French largest daily newspaper company's use case.

Keywords: interactive relation extraction; French newspaper articles; newspaper articles

Agent-Oriented Software Engineering, full development lifecycle

Damascus University, 2010 (research paper)

This research traces, after conducting a wide literature survey, the areas not covered by prominent agent-oriented software engineering (AOSE) methodologies. Each methodology has its strengths and weaknesses and focuses on some stages of the software development lifecycle but not all of them. This paper presents an addition to a well-established AOSE methodology (MaSE). MaSE is considered one of the strongest in the field; it does not, however, support handling early requirements. This work integrates MaSE with another methodology known for its strength in early requirement representation. The integration required the development of a wide set of translation rules between two different environments of notations and graphical representations. A software tool was developed to automate the translation, and a case study is used to demonstrate the work.

Keywords: software engineering; agents; intelligent agents; SE; UML; AUML; design patterns

Learning to Synthesize Data for Semantic Parsing

Association for Computational Linguistics, 2021 (article)

Synthesizing data for semantic parsing has gained increasing attention recently. However, most methods require handcrafted (high-precision) rules in their generative process, hindering the exploration of diverse unseen data. In this work, we propose a generative model which features a (non-neural) PCFG that models the composition of programs (e.g., SQL), and a BART-based translation model that maps a program to an utterance. Due to the simplicity of PCFG and pre-trained BART, our generative model can be efficiently learned from existing data at hand. Moreover, explicitly modeling compositions using PCFG leads to better exploration of unseen programs, thus generating more diverse data. We evaluate our method in both in-domain and out-of-domain settings of text-to-SQL parsing on the standard benchmarks of GeoQuery and Spider, respectively. Our empirical results show that the synthesized data generated from our model can substantially help a semantic parser achieve better compositional and domain generalization.

Keywords: learning to synthesize; synthesize data
