
APIRecX: Cross-Library API Recommendation via Pre-Trained Language Model



Added by Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: application programming interfaces; API recommendation; cross-library API recommendation


Abstract


For programmers, learning to use the APIs (Application Programming Interfaces) of a software library is important yet difficult. API recommendation tools can help developers by recommending which APIs to use next, given the APIs that have already been written. Traditionally, language models such as N-gram models are applied to API recommendation. However, because software libraries keep changing and new libraries keep emerging, new APIs are common. These new APIs can be seen as OOV (out-of-vocabulary) words and cannot be handled well by existing API recommendation approaches due to the lack of training data. In this paper, we propose APIRecX, the first cross-library API recommendation approach, which uses BPE to split each API call in each API sequence and pre-trains a GPT-based language model. It then recommends APIs by fine-tuning the pre-trained model. APIRecX can migrate the knowledge of existing libraries to a new library, and can recommend APIs that were previously regarded as OOV. We evaluate APIRecX on six libraries, and a comparison with two typical API recommendation approaches confirms its effectiveness.
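To make the OOV point concrete, the following is a minimal sketch of byte-pair encoding (BPE) applied to API call names. It is a toy, character-level illustration and not the authors' implementation: the API names, the helper functions `learn_bpe`, `merge_pair`, and `apply_bpe`, and the merge count are all assumptions chosen for the example, and the GPT pre-training and fine-tuning steps are not shown.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe(words, num_merges):
    """Learn an ordered list of BPE merge rules from a list of strings."""
    # Every word starts out as a sequence of single characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Re-segment every word with the newly learned merge.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(symbols, best))] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Segment a (possibly unseen) string by replaying the merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


# Hypothetical API calls from "existing" libraries (illustrative names only).
training_apis = [
    "FileReader.read", "FileReader.close",
    "FileWriter.write", "FileWriter.close",
    "BufferedReader.read", "BufferedReader.close",
]
merges = learn_bpe(training_apis, num_merges=20)

# A whole-token vocabulary treats a new API as out of vocabulary.
word_vocab = set(training_apis)
unseen_api = "BufferedWriter.write"  # hypothetical API of a "new" library

# A subword vocabulary contains single characters plus every merged symbol.
subword_vocab = {ch for w in training_apis for ch in w}
subword_vocab |= {a + b for a, b in merges}

tokens = apply_bpe(unseen_api, merges)
print("OOV at whole-token level:", unseen_api not in word_vocab)
print("Subword tokens:", tokens)
print("All tokens known:", all(t in subword_vocab for t in tokens))
```

In this sketch the unseen call is absent from the whole-token vocabulary, yet every piece it is split into was already seen during training, which is the property that lets a subword-level language model assign it a probability at all.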

References used

https://aclanthology.org/


Multilingual Translation via Grafting Pre-trained Language Models

Association for Computational Linguistics, 2021 (article)

Can pre-trained BERT for one language and GPT for another be glued together to translate texts? Self-supervised training using only monolingual data has led to the success of pre-trained (masked) language models in many NLP tasks. However, directly connecting BERT as an encoder and GPT as a decoder can be challenging in machine translation, because GPT-like models lack the cross-attention component that is needed in seq2seq decoders. In this paper, we propose Graformer to graft separately pre-trained (masked) language models for machine translation. With monolingual data for pre-training and parallel data for grafting training, we take maximal advantage of both types of data. Experiments on 60 directions show that our method achieves average improvements of 5.8 BLEU in x2en and 2.9 BLEU in en2x directions compared with the multilingual Transformer of the same size.

Keywords: augmented code generation; grafting pre-trained language

Preserving Cross-Linguality of Pre-trained Models via Continual Learning

Association for Computational Linguistics, 2021 (article)

Recently, fine-tuning pre-trained language models (e.g., multilingual BERT) to downstream cross-lingual tasks has shown promising results. However, the fine-tuning process inevitably changes the parameters of the pre-trained model and weakens its cross-lingual ability, which leads to sub-optimal performance. To alleviate this problem, we leverage continual learning to preserve the original cross-lingual ability of the pre-trained model when we fine-tune it to downstream tasks. The experimental results show that our fine-tuning methods can better preserve the cross-lingual ability of the pre-trained model in a sentence retrieval task. Our methods also achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.

Keywords: Siamese-based BERT; preserving cross-linguality; continual learning

CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model

Association for Computational Linguistics, 2021 (article)

A commit message is a document that summarizes source code changes in natural language. A good commit message clearly shows the source code changes, which enhances collaboration between developers. Therefore, our work is to develop a model that automatically writes the commit message. To this end, we release 345K datasets consisting of code modification and commit messages in six programming languages (Python, PHP, Go, Java, JavaScript, and Ruby). Similar to the neural machine translation (NMT) model, using our dataset, we feed the code modification to the encoder input and the commit message to the decoder input and measure the result of the generated commit message with BLEU-4. We also propose the following two training methods to improve the result of generating the commit message: (1) A method of preprocessing the input to feed the code modification to the encoder input. (2) A method that uses an initial weight suitable for the code domain to reduce the gap in contextual representation between programming language (PL) and natural language (NL).

Keywords: commit message generation; commit message generation using pre-trained

PDALN: Progressive Domain Adaptation over a Pre-trained Model for Low-Resource Cross-Domain Named Entity Recognition

Association for Computational Linguistics, 2021 (article)

Cross-domain Named Entity Recognition (NER) transfers the NER knowledge from high-resource domains to the low-resource target domain. Due to limited labeled resources and domain shift, cross-domain NER is a challenging task. To address these challenges, we propose a progressive domain adaptation Knowledge Distillation (KD) approach -- PDALN. It achieves superior domain adaptability by employing three components: (1) Adaptive data augmentation techniques, which alleviate the cross-domain gap and label sparsity simultaneously; (2) Multi-level domain-invariant features, derived from a multi-grained MMD (Maximum Mean Discrepancy) approach, to enable knowledge transfer across domains; (3) Advanced KD schema, which progressively enables powerful pre-trained language models to perform domain adaptation. Extensive experiments on four benchmarks show that PDALN can effectively adapt high-resource domains to low-resource target domains, even if they are diverse in terms and writing styles. Comparison with other baselines indicates the state-of-the-art performance of PDALN.

Keywords: cross-lingual transfer learning; cross-domain named entity

Improving Cross-Lingual Sentiment Analysis via Conditional Language Adversarial Nets

Association for Computational Linguistics, 2021 (article)

Sentiment analysis has come a long way for high-resource languages due to the availability of large annotated corpora. However, it still suffers from a lack of training data for low-resource languages. To tackle this problem, we propose Conditional Language Adversarial Network (CLAN), an end-to-end neural architecture for cross-lingual sentiment analysis without cross-lingual supervision. CLAN differs from prior work in that it allows the adversarial training to be conditioned on both learned features and the sentiment prediction, to increase discriminativity for learned representation in the cross-lingual setting. Experimental results demonstrate that CLAN outperforms previous methods on the multilingual multi-domain Amazon review dataset. Our source code is released at https://github.com/hemanthkandula/clan.

Keywords: language adversarial nets; conditional language adversarial; adversarial nets
