ترغب بنشر مسار تعليمي؟ اضغط هنا

Adversarial Patch Generation for Automatic Program Repair

74   0   0.0 ( 0 )
 نشر من قبل Abdulaziz Alhefdhi
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Automatic program repair (APR) has seen a growing interest in recent years with numerous techniques proposed. One notable line of research work in APR is search-based techniques which generate repair candidates via syntactic analyses and search for valid repairs in the generated search space. In this work, we explore an alternative approach which is inspired by the adversarial notion of bugs and repairs. Our approach leverages the deep learning Generative Adversarial Networks (GANs) architecture to suggest repairs that are as close as possible to human generated repairs. Preliminary evaluations demonstrate promising results of our approach (generating repairs exactly the same as human fixes for 21.2% of 500 bugs).

قيم البحث

اقرأ أيضاً

We introduce Learn2fix, the first human-in-the-loop, semi-automatic repair technique when no bug oracle--except for the user who is reporting the bug--is available. Our approach negotiates with the user the condition under which the bug is observed. Only when a budget of queries to the user is exhausted, it attempts to repair the bug. A query can be thought of as the following question: When executing this alternative test input, the program produces the following output; is the bug observed? Through systematic queries, Learn2fix trains an automatic bug oracle that becomes increasingly more accurate in predicting the users response. Our key challenge is to maximize the oracles accuracy in predicting which tests are bug-exposing given a small budget of queries. From the alternative tests that were labeled by the user, test-driven automatic repair produces the patch. Our experiments demonstrate that Learn2fix learns a sufficiently accurate automatic oracle with a reasonably low labeling effort (lt. 20 queries). Given Learn2fixs test suite, the GenProg test-driven repair tool produces a higher-quality patch (i.e., passing a larger proportion of validation tests) than using manual test suites provided with the repair benchmark.
Automatic program repair (APR) is crucial to improve software reliability. Recently, neural machine translation (NMT) techniques have been used to fix software bugs automatically. While promising, these approaches have two major limitations. Their se arch space often does not contain the correct fix, and their search strategy ignores software knowledge such as strict code syntax. Due to these limitations, existing NMT-based techniques underperform the best template-based approaches. We propose CURE, a new NMT-based APR technique with three major novelties. First, CURE pre-trains a programming language (PL) model on a large software codebase to learn developer-like source code before the APR task. Second, CURE designs a new code-aware search strategy that finds more correct fixes by focusing on compilable patches and patches that are close in length to the buggy code. Finally, CURE uses a subword tokenization technique to generate a smaller search space that contains more correct fixes. Our evaluation on two widely-used benchmarks shows that CURE correctly fixes 57 Defects4J bugs and 26 QuixBugs bugs, outperforming all existing APR techniques on both benchmarks.
A large body of the literature of automated program repair develops approaches where patches are generated to be validated against an oracle (e.g., a test suite). Because such an oracle can be imperfect, the generated patches, although validated by t he oracle, may actually be incorrect. While the state of the art explore research directions that require dynamic information or rely on manually-crafted heuristics, we study the benefit of learning code representations to learn deep features that may encode the properties of patch correctness. Our work mainly investigates different representation learning approaches for code changes to derive embeddings that are amenable to similarity computations. We report on findings based on embeddings produced by pre-trained and re-trained neural networks. Experimental results demonstrate the potential of embeddings to empower learning algorithms in reasoning about patch correctness: a machine learning predictor with BERT transformer-based embeddings associated with logistic regression yielded an AUC value of about 0.8 in predicting patch correctness on a deduplicated dataset of 1000 labeled patches. Our study shows that learned representations can lead to reasonable performance when comparing against the state-of-the-art, PATCH-SIM, which relies on dynamic information. These representations may further be complementary to features that were carefully (manually) engineered in the literature.
Despite significant advances in automatic program repair (APR)techniques over the past decade, practical deployment remains an elusive goal. One of the important challenges in this regard is the general inability of current APR techniques to produce patches that require edits in multiple locations, i.e., multi-hunk patches. In this work, we present a novel APR technique that generalizes single-hunk repair techniques to include an important class of multi-hunk bugs, namely bugs that may require applying a substantially similar patch at a number of locations. We term such sets of repair locations as evolutionary siblings - similar looking code, instantiated in similar contexts, that are expected to undergo similar changes. At the heart of our proposed method is an analysis to accurately identify a set of evolutionary siblings, for a given bug. This analysis leverages three distinct sources of information, namely the test-suite spectrum, a novel code similarity analysis, and the revision history of the project. The discovered siblings are then simultaneously repaired in a similar fashion. We instantiate this technique in a tool called Hercules and demonstrate that it is able to correctly fix 49 bugs in the Defects4J dataset, the highest of any individual APR technique to date. This includes 15 multi-hunk bugs and overall 13 bugs which have not been fixed by any other technique so far.
Relative correctness is the property of a program to be more-correct than another with respect to a given specification. Whereas the traditional definition of (absolute) correctness divides candidate program into two classes (correct, and incorrect), relative correctness arranges candidate programs on the richer structure of a partial ordering. In other venues we discuss the impact of relative correctness on program derivation, and on program verification. In this paper, we discuss the impact of relative correctness on program testing; specifically, we argue that when we remove a fault from a program, we ought to test the new program for relative correctness over the old program, rather than for absolute correctness. We present analytical arguments to support our position, as well as an empirical argument in the form of a small program whose faults are removed in a stepwise manner as its relative correctness rises with each fault removal until we obtain a correct program.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا