Holistic Combination of Structural and Textual Code Information for Context based API Recommendation

106 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Chi Chen

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Chi Chen - Xin Peng - Zhenchang Xing

هندسة البرمجيات الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Context based API recommendation is an important way to help developers find the needed APIs effectively and efficiently. For effective API recommendation, we need not only a joint view of both structural and textual code information, but also a holistic view of correlated API usage in control and data flow graph as a whole. Unfortunately, existing API recommendation methods exploit structural or textual code information separately. In this work, we propose a novel API recommendation approach called APIRec-CST (API Recommendation by Combining Structural and Textual code information). APIRec-CST is a deep learning model that combines the API usage with the text information in the source code based on an API Context Graph Network and a Code Token Network that simultaneously learn structural and textual features for API recommendation. We apply APIRec-CST to train a model for JDK library based on 1,914 open-source Java projects and evaluate the accuracy and MRR (Mean Reciprocal Rank) of API recommendation with another 6 open-source projects. The results show that our approach achieves respectively a top-1, top-5, top-10 accuracy and MRR of 60.3%, 81.5%, 87.7% and 69.4%, and significantly outperforms an existing graph-based statistical approach and a tree-based deep learning approach for API recommendation. A further analysis shows that textual code information makes sense and improves the accuracy and MRR. We also conduct a user study in which two groups of students are asked to finish 6 programming tasks with or without our APIRec-CST plugin. The results show that APIRec-CST can help the students to finish the tasks faster and more accurately and the feedback on the usability is overwhelmingly positive.

قيم البحث

159 - Chen Lyu , Ruyun Wang , Hongyu Zhang 2021

The problem of code generation from textual program descriptions has long been viewed as a grand challenge in software engineering. In recent years, many deep learning based approaches have been proposed, which can generate a sequence of code from a sequence of textual program description. However, the existing approaches ignore the global relationships among API methods, which are important for understanding the usage of APIs. In this paper, we propose to model the dependencies among API methods as an API dependency graph (ADG) and incorporate the graph embedding into a sequence-to-sequence (Seq2Seq) model. In addition to the existing encoder-decoder structure, a new module named ``embedder is introduced. In this way, the decoder can utilize both global structural dependencies and textual program description to predict the target code. We conduct extensive code generation experiments on three public datasets and in two programming languages (Python and Java). Our proposed approach, called ADG-Seq2Seq, yields significant improvements over existing state-of-the-art methods and maintains its performance as the length of the target code increases. Extensive ablation tests show that the proposed ADG embedding is effective and outperforms the baselines.

هندسة البرمجيات الذكاء الاصطناعي

deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search

256 - Chen Zeng , Yue Yu , Shanshan Li 2021

With the rapid increase in the amount of public code repositories, developers maintain a great desire to retrieve precise code snippets by using natural language. Despite existing deep learning based approaches(e.g., DeepCS and MMAN) have provided th e end-to-end solutions (i.e., accepts natural language as queries and shows related code fragments retrieved directly from code corpus), the accuracy of code search in the large-scale repositories is still limited by the code representation (e.g., AST) and modeling (e.g., directly fusing the features in the attention stage). In this paper, we propose a novel learnable deep Graph for Code Search (calleddeGraphCS), to transfer source code into variable-based flow graphs based on the intermediate representation technique, which can model code semantics more precisely compared to process the code as text directly or use the syntactic tree representation. Furthermore, we propose a well-designed graph optimization mechanism to refine the code representation, and apply an improved gated graph neural network to model variable-based flow graphs. To evaluate the effectiveness of deGraphCS, we collect a large-scale dataset from GitHub containing 41,152 code snippets written in C language, and reproduce several typical deep code search methods for comparison. Besides, we design a qualitative user study to verify the practical value of our approach. The experimental results have shown that deGraphCS can achieve state-of-the-art performances, and accurately retrieve code snippets satisfying the needs of the users.

هندسة البرمجيات الذكاء الاصطناعي

Model-based Testing of the Java Network API

107 - Cyrille Artho 2017

Testing networked systems is challenging. The client or server side cannot be tested by itself. We present a solution using tool Modbat that generates test cases for Javas network library java.nio, where we test both blocking and non-blocking network functions. Our test model can dynamically simulate actions in multiple worker and client threads, thanks to a carefully orchestrated design that covers non-determinism while ensuring progress.

هندسة البرمجيات

Recommendation of Exception Handling Code in Mobile App Development

109 - Tam The Nguyen , Phong Minh Vu , Tung Thanh Nguyen 2019

In modern programming languages, exception handling is an effective mechanism to avoid unexpected runtime errors. Thus, failing to catch and handle exceptions could lead to serious issues like system crashing, resource leaking, or negative end-user e xperiences. However, writing correct exception handling code is often challenging in mobile app development due to the fast-changing nature of API libraries for mobile apps and the insufficiency of their documentation and source code examples. Our prior study shows that in practice mobile app developers cause many exception-related bugs and still use bad exception handling practices (e.g. catch an exception and do nothing). To address such problems, in this paper, we introduce two novel techniques for recommending correct exception handling code. One technique, XRank, recommends code to catch an exception likely occurring in a code snippet. The other, XHand, recommends correction code for such an occurring exception. We have developed ExAssist, a code recommendation tool for exception handling using XRank and XHand. The empirical evaluation shows that our techniques are highly effective. For example, XRank has top-1 accuracy of 70% and top-3 accuracy of 87%. XHands results are 89% and 96%, respectively.

هندسة البرمجيات

Code Generation Based on Deep Learning: a Brief Review

190 - Qihao Zhu , Wenjie Zhang 2021

Automatic software development has been a research hot spot in the field of software engineering (SE) in the past decade. In particular, deep learning (DL) has been applied and achieved a lot of progress in various SE tasks. Among all applications, a utomatic code generation by machines as a general concept, including code completion and code synthesis, is a common expectation in the field of SE, which may greatly reduce the development burden of the software developers and improves the efficiency and quality of the software development process to a certain extent. Code completion is an important part of modern integrated development environments (IDEs). Code completion technology effectively helps programmers complete code class names, method names, and key-words, etc., which improves the efficiency of program development and reduces spelling errors in the coding process. Such tools use static analysis on the code and provide candidates for completion arranged in alphabetical order. Code synthesis is implemented from two aspects, one based on input-output samples and the other based on functionality description. In this study, we introduce existing techniques of these two aspects and the corresponding DL techniques, and present some possible future research directions.

هندسة البرمجيات الذكاء الاصطناعي