ReAssert: Deep Learning for Assert Generation

70 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Robert White Mr

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Robert White - Jens Krinke

هندسة البرمجيات الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The automated generation of test code can reduce the time and effort required to build software while increasing its correctness and robustness. In this paper, we present RE-ASSERT, an approach for the automated generation of JUnit test asserts which produces more accurate asserts than previous work with fewer constraints. This is achieved by targeting projects individually, using precise code-to-test traceability for learning and by generating assert statements from the method-under-test directly without the need to write an assert-less test first. We also utilise Reformer, a state-of-the-art deep learning model, along with two models from previous work to evaluate ReAssert and an existing approach, known as ATLAS, using lexical accuracy,uniqueness, and dynamic analysis. Our evaluation of ReAssert shows up to 44% of generated asserts for a single project match exactly with the ground truth, increasing to 51% for generated asserts that compile. We also improve on the ATLAS results through our use of Reformer with 28% of generated asserts matching exactly with the ground truth. Reformer also produces the greatest proportion of unique asserts (71%), giving further evidence that Reformer produces the most useful asserts.

قيم البحث

193 - Cody Watson , Michele Tufano , Kevin Moran 2020

Software testing is an essential part of the software lifecycle and requires a substantial amount of time and effort. It has been estimated that software developers spend close to 50% of their time on testing the code they write. For these reasons, a long standing goal within the research community is to (partially) automate software testing. While several techniques and tools have been proposed to automatically generate test methods, recent work has criticized the quality and usefulness of the assert statements they generate. Therefore, we employ a Neural Machine Translation (NMT) based approach called Atlas(AuTomatic Learning of Assert Statements) to automatically generate meaningful assert statements for test methods. Given a test method and a focal method (i.e.,the main method under test), Atlas can predict a meaningful assert statement to assess the correctness of the focal method. We applied Atlas to thousands of test methods from GitHub projects and it was able to predict the exact assert statement manually written by developers in 31% of the cases when only considering the top-1 predicted assert. When considering the top-5 predicted assert statements, Atlas is able to predict exact matches in 50% of the cases. These promising results hint to the potential usefulness ofour approach as (i) a complement to automatic test case generation techniques, and (ii) a code completion support for developers, whocan benefit from the recommended assert statements while writing test code.

هندسة البرمجيات

Code Generation Based on Deep Learning: a Brief Review

190 - Qihao Zhu , Wenjie Zhang 2021

Automatic software development has been a research hot spot in the field of software engineering (SE) in the past decade. In particular, deep learning (DL) has been applied and achieved a lot of progress in various SE tasks. Among all applications, a utomatic code generation by machines as a general concept, including code completion and code synthesis, is a common expectation in the field of SE, which may greatly reduce the development burden of the software developers and improves the efficiency and quality of the software development process to a certain extent. Code completion is an important part of modern integrated development environments (IDEs). Code completion technology effectively helps programmers complete code class names, method names, and key-words, etc., which improves the efficiency of program development and reduces spelling errors in the coding process. Such tools use static analysis on the code and provide candidates for completion arranged in alphabetical order. Code synthesis is implemented from two aspects, one based on input-output samples and the other based on functionality description. In this study, we introduce existing techniques of these two aspects and the corresponding DL techniques, and present some possible future research directions.

هندسة البرمجيات الذكاء الاصطناعي

Combinatorial Testing for Deep Learning Systems

104 - Lei Ma , Fuyuan Zhang , Minhui Xue 2018

Deep learning (DL) has achieved remarkable progress over the past decade and been widely applied to many safety-critical applications. However, the robustness of DL systems recently receives great concerns, such as adversarial examples against comput er vision systems, which could potentially result in severe consequences. Adopting testing techniques could help to evaluate the robustness of a DL system and therefore detect vulnerabilities at an early stage. The main challenge of testing such systems is that its runtime state space is too large: if we view each neuron as a runtime state for DL, then a DL system often contains massive states, rendering testing each state almost impossible. For traditional software, combinatorial testing (CT) is an effective testing technique to reduce the testing space while obtaining relatively high defect detection abilities. In this paper, we perform an exploratory study of CT on DL systems. We adapt the concept in CT and propose a set of coverage criteria for DL systems, as well as a CT coverage guided test generation technique. Our evaluation demonstrates that CT provides a promising avenue for testing DL systems. We further pose several open questions and interesting directions for combinatorial testing of DL systems.

هندسة البرمجيات الذكاء الاصطناعي التشفير والأمن

Detecting Operational Adversarial Examples for Reliable Deep Learning

135 - Xingyu Zhao , Wei Huang , Sven Schewe 2021

The utilisation of Deep Learning (DL) raises new challenges regarding its dependability in critical applications. Sound verification and validation methods are needed to assure the safe and reliable use of DL. However, state-of-the-art debug testing methods on DL that aim at detecting adversarial examples (AEs) ignore the operational profile, which statistically depicts the softwares future operational use. This may lead to very modest effectiveness on improving the softwares delivered reliability, as the testing budget is likely to be wasted on detecting AEs that are unrealistic or encountered very rarely in real-life operation. In this paper, we first present the novel notion of operational AEs which are AEs that have relatively high chance to be seen in future operation. Then an initial design of a new DL testing method to efficiently detect operational AEs is provided, as well as some insights on our prospective research plan.

هندسة البرمجيات الذكاء الاصطناعي

Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers

134 - Michele Tufano , Dawn Drain , Alexey Svyatkovskiy 2020

Unit testing represents the foundational basis of the software testing pyramid, beneath integration and end-to-end testing. Automated software testing researchers have proposed a variety of techniques to assist developers in this time-consuming task. In this paper we present an approach to support developers in writing unit test cases by generating accurate and useful assert statements. Our approach is based on a state-of-the-art transformer model initially pretrained on an English textual corpus. This semantically rich model is then trained in a semi-supervised fashion on a large corpus of source code. Finally, we finetune this model on the task of generating assert statements for unit tests. The resulting model is able to generate accurate assert statements for a given method under test. In our empirical evaluation, the model was able to predict the exact assert statements written by developers in 62% of the cases in the first attempt. The results show 80% relative improvement for top-1 accuracy over the previous RNN-based approach in the literature. We also show the substantial impact of the pretraining process on the performances of our model, as well as comparing it with assert auto-completion task. Finally, we demonstrate how our approach can be used to augment EvoSuite test cases, with additional asserts leading to improved test coverage.

هندسة البرمجيات التعلم الآلي