New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Retrieval Augmented Code Generation and Summarization

استرجاع توليد التعليمات البرمجية المعزز

628 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

تحسين المنطق العددي code generation augmented code generation رمز الجيل توليد رمز المعزز صناعة حمض الفوسفور

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Software developers write a lot of source code and documentation during software development. Intrinsically, developers often recall parts of source code or code summaries that they had written in the past while implementing software or documenting them. To mimic developers' code or summary generation behavior, we propose a retrieval augmented framework, REDCODER, that retrieves relevant code or summaries from a retrieval database and provides them as a supplement to code generation or summarization models. REDCODER has a couple of uniqueness. First, it extends the state-of-the-art dense retrieval technique to search for relevant code or summaries. Second, it can work with retrieval databases that include unimodal (only code or natural language description) or bimodal instances (code-description pairs). We conduct experiments and extensive analysis on two benchmark datasets of code generation and summarization in Java and Python, and the promising results endorse the effectiveness of our proposed retrieval augmented framework.

References used

https://aclanthology.org/

rate research

Robust Retrieval Augmented Generation for Zero-shot Slot Filling

417 - Association for Computation Linguistics 2021 مقالة

Automatically inducing high quality knowledge graphs from a given collection of documents still remains a challenging problem in AI. One way to make headway for this problem is through advancements in a related task known as slot filling. In this tas k, given an entity query in form of [Entity, Slot, ?], a system is asked to fill' the slot by generating or extracting the missing value exploiting evidence extracted from relevant passage(s) in the given document collection. The recent works in the field try to solve this task in an end-to-end fashion using retrieval-based language models. In this paper, we present a novel approach to zero-shot slot filling that extends dense passage retrieval with hard negatives and robust training procedures for retrieval augmented generation models. Our model reports large improvements on both T-REx and zsRE slot filling datasets, improving both passage retrieval and slot value generation, and ranking at the top-1 position in the KILT leaderboard. Moreover, we demonstrate the robustness of our system showing its domain adaptation capability on a new variant of the TACRED dataset for slot filling, through a combination of zero/few-shot learning. We release the source code and pre-trained models.

نهج السيارات الآلي المتغير zero-shot slot filling retrieval augmented generation ملء فتحة صفر النار استرجاع الجيل المعزز صناعة حمض الفوسفور

Structure-Augmented Keyphrase Generation

215 - Association for Computation Linguistics 2021 مقالة

This paper studies the keyphrase generation (KG) task for scenarios where structure plays an important role. For example, a scientific publication consists of a short title and a long body, where the title can be used for de-emphasizing unimportant d etails in the body. Similarly, for short social media posts (, tweets), scarce context can be augmented from titles, though often missing. Our contribution is generating/augmenting structure then injecting these information in the encoding, using existing keyphrases of other documents, complementing missing/incomplete titles. We propose novel structure-augmented document encoding approaches that consist of the following two phases: The first phase, generating structure, extends the given document with related but absent keyphrases, augmenting missing context. The second phase, encoding structure, builds a graph of keyphrases and the given document to obtain the structure-aware representation of the augmented text. Our empirical results validate that our proposed structure augmentation and augmentation-aware encoding/decoding can improve KG for both scenarios, outperforming the state-of-the-art.

استخراج ثلاثي structure-augmented keyphrase generation توليد تسخير الهيكل المعزز صناعة حمض الفوسفور

HinGE: A Dataset for Generation and Evaluation of Code-Mixed Hinglish Text

424 - Association for Computation Linguistics 2021 مقالة

Text generation is a highly active area of research in the computational linguistic community. The evaluation of the generated text is a challenging task and multiple theories and metrics have been proposed over the years. Unfortunately, text generat ion and evaluation are relatively understudied due to the scarcity of high-quality resources in code-mixed languages where the words and phrases from multiple languages are mixed in a single utterance of text and speech. To address this challenge, we present a corpus (HinGE) for a widely popular code-mixed language Hinglish (code-mixing of Hindi and English languages). HinGE has Hinglish sentences generated by humans as well as two rule-based algorithms corresponding to the parallel Hindi-English sentences. In addition, we demonstrate the in- efficacy of widely-used evaluation metrics on the code-mixed data. The HinGE dataset will facilitate the progress of natural language generation research in code-mixed languages.

المتغيرات المستهدفة code-mixed hinglish text النص مختلط النص Hinglish صناعة حمض الفوسفور

Knowledge and Keywords Augmented Abstractive Sentence Summarization

422 - Association for Computation Linguistics 2021 مقالة

In this paper, we study the abstractive sentence summarization. There are two essential information features that can influence the quality of news summarization, which are topic keywords and the knowledge structure of the news text. Besides, the exi sting knowledge encoder has poor performance on sparse sentence knowledge structure. Considering these, we propose KAS, a novel Knowledge and Keywords Augmented Abstractive Sentence Summarization framework. Tri-encoders are utilized to integrate contexts of original text, knowledge structure and keywords topic simultaneously, with a special linearized knowledge structure. Automatic and human evaluations demonstrate that KAS achieves the best performances.

abstractive sentence summarization augmented abstractive sentence keywords augmented abstractive تلخيص الجملة الجماعية الجملة المبادرة المعزز الكلمات الرئيسية المعزز المبادرة صناعة حمض الفوسفور المزيد..

Comparing Grammatical Theories of Code-Mixing

315 - Association for Computation Linguistics 2021 مقالة

Code-mixed text generation systems have found applications in many downstream tasks, including speech recognition, translation and dialogue. A paradigm of these generation systems relies on well-defined grammatical theories of code-mixing, and there is a lack of comparison of these theories. We present a large-scale human evaluation of two popular grammatical theories, Matrix-Embedded Language (ML) and Equivalence Constraint (EC). We compare them against three heuristic-based models and quantitatively demonstrate the effectiveness of the two grammatical theories.

comparing grammatical theories grammatical theories comparing grammatical مقارنة النظريات النحوية النظريات النحوية مقارنة النحوية صناعة حمض الفوسفور المزيد..

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Retrieval Augmented Code Generation and Summarization

استرجاع توليد التعليمات البرمجية المعزز

Ask ChatGPT about the research

Read More

suggested questions