Patching as Translation: the Data and the Metaphor

115 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Yangruibo Ding

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Yangruibo Ding - Baishakhi Ray - Premkumar Devanbu

هندسة البرمجيات التعلم الآلي لغات البرمجة

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Machine Learning models from other fields, like Computational Linguistics, have been transplanted to Software Engineering tasks, often quite successfully. Yet a transplanted models initial success at a given task does not necessarily mean it is well-suited for the task. In this work, we examine a common example of this phenomenon: the conceit that software patching is like language translation. We demonstrate empirically that there are subtle, but critical distinctions between sequence-to-sequence models and translation model: while program repair benefits greatly from the former, general modeling architecture, it actually suffers from design decisions built into the latter, both in terms of translation accuracy and diversity. Given these findings, we demonstrate how a more principled approach to model design, based on our empirical findings and general knowledge of software development, can lead to better solutions. Our findings also lend strong support to the recent trend towards synthesizing edits of code conditional on the buggy context, to repair bugs. We implement such models ourselves as proof-of-concept tools and empirically confirm that they behave in a fundamentally different, more effective way than the studied translation-based architectures. Overall, our results demonstrate the merit of studying the intricacies of machine learned models in software engineering: not only can this help elucidate potential issues that may be overshadowed by increases in accuracy; it can also help innovate on these models to raise the state-of-the-art further. We will publicly release our replication data and materials at https://github.com/ARiSE-Lab/Patch-as-translation.

قيم البحث

اقرأ أيضاً

SyGuS-Comp 2017: Results and Analysis

99 - Rajeev Alur 2017

Syntax-Guided Synthesis (SyGuS) is the computational problem of finding an implementation f that meets both a semantic constraint given by a logical formula phi in a background theory T, and a syntactic constraint given by a grammar G, which specifie s the allowed set of candidate implementations. Such a synthesis problem can be formally defined in SyGuS-IF, a language that is built on top of SMT-LIB. The Syntax-Guided Synthesis Competition (SyGuS-Comp) is an effort to facilitate, bring together and accelerate research and development of efficient solvers for SyGuS by providing a platform for evaluating different synthesis techniques on a comprehensive set of benchmarks. In this years competition six new solvers competed on over 1500 benchmarks. This paper presents and analyses the results of SyGuS-Comp17.

هندسة البرمجيات التعلم الآلي لغات البرمجة

MACER: A Modular Framework for Accelerated Compilation Error Repair

119 - Darshak Chhatbar , Umair Z. Ahmed , Purushottam Kar 2020

Automated compilation error repair, the problem of suggesting fixes to buggy programs that fail to compile, has generated significant interest in recent years. Apart from being a tool of general convenience, automated code repair has significant peda gogical applications for novice programmers who find compiler error messages cryptic and unhelpful. Existing approaches largely solve this problem using a blackbox-application of a heavy-duty generative learning technique, such as sequence-to-sequence prediction (TRACER) or reinforcement learning (RLAssist). Although convenient, such black-box application of learning techniques makes existing approaches bulky in terms of training time, as well as inefficient at targeting specific error types. We present MACER, a novel technique for accelerated error repair based on a modular segregation of the repair process into repair identification and repair application. MACER uses powerful yet inexpensive discriminative learning techniques such as multi-label classifiers and rankers to first identify the type of repair required and then apply the suggested repair. Experiments indicate that the fine-grained approach adopted by MACER offers not only superior error correction, but also much faster training and prediction. On a benchmark dataset of 4K buggy programs collected from actual student submissions, MACER outperforms existing methods by 20% at suggesting fixes for popular errors that exactly match the fix desired by the student. MACER is also competitive or better than existing methods at all error types -- whether popular or rare. MACER offers a training time speedup of 2x over TRACER and 800x over RLAssist, and a test time speedup of 2-4x over both.

هندسة البرمجيات التعلم الآلي لغات البرمجة

ConE: A Concurrent Edit Detection Tool for Large ScaleSoftware Development

123 - Chandra Maddila , Nachiappan Nagappan , Christian Bird 2021

Modern, complex software systems are being continuously extended and adjusted. The developers responsible for this may come from different teams or organizations, and may be distributed over the world. This may make it difficult to keep track of what other developers are doing, which may result in multiple developers concurrently editing the same code areas. This, in turn, may lead to hard-to-merge changes or even merge conflicts, logical bugs that are difficult to detect, duplication of work, and wasted developer productivity. To address this, we explore the extent of this problem in the pull request based software development model. We study half a year of changes made to six large repositories in Microsoft in which at least 1,000 pull requests are created each month. We find that files concurrently edited in different pull requests are more likely to introduce bugs. Motivated by these findings, we design, implement, and deploy a service named ConE (Concurrent Edit Detector) that proactively detects pull requests containing concurrent edits, to help mitigate the problems caused by them. ConE has been designed to scale, and to minimize false alarms while still flagging relevant concurrently edited files. Key concepts of ConE include the detection of the Extent of Overlap between pull requests, and the identification of Rarely Concurrently Edited Files. To evaluate ConE, we report on its operational deployment on 234 repositories inside Microsoft. ConE assessed 26,000 pull requests and made 775 recommendations about conflicting changes, which were rated as useful in over 70% (554) of the cases. From interviews with 48 users we learned that they believed ConE would save time in conflict resolution and avoiding duplicate work, and that over 90% intend to keep using the service on a daily basis.

هندسة البرمجيات التعلم الآلي لغات البرمجة

Improving type information inferred by decompilers with supervised machine learning

334 - Javier Escalada , Francisco Ortin (1 2021

In software reverse engineering, decompilation is the process of recovering source code from binary files. Decompilers are used when it is necessary to understand or analyze software for which the source code is not available. Although existing decom pilers commonly obtain source code with the same behavior as the binaries, that source code is usually hard to interpret and certainly differs from the original code written by the programmer. Massive codebases could be used to build supervised machine learning models aimed at improving existing decompilers. In this article, we build different classification models capable of inferring the high-level type returned by functions, with significantly higher accuracy than existing decompilers. We automatically instrument C source code to allow the association of binary patterns with their corresponding high-level constructs. A dataset is created with a collection of real open-source applications plus a huge number of synthetic programs. Our system is able to predict function return types with a 79.1% F1-measure, whereas the best decompiler obtains a 30% F1-measure. Moreover, we document the binary patterns used by our classifier to allow their addition in the implementation of existing decompilers.

هندسة البرمجيات التعلم الآلي لغات البرمجة

SpreadsheetCoder: Formula Prediction from Semi-structured Context

115 - Xinyun Chen , Petros Maniatis , Rishabh Singh 2021

Spreadsheet formula prediction has been an important program synthesis problem with many real-world applications. Previous works typically utilize input-output examples as the specification for spreadsheet formula synthesis, where each input-output p air simulates a separate row in the spreadsheet. However, this formulation does not fully capture the rich context in real-world spreadsheets. First, spreadsheet data entries are organized as tables, thus rows and columns are not necessarily independent from each other. In addition, many spreadsheet tables include headers, which provide high-level descriptions of the cell data. However, previous synthesis approaches do not consider headers as part of the specification. In this work, we present the first approach for synthesizing spreadsheet formulas from tabular context, which includes both headers and semi-structured tabular data. In particular, we propose SpreadsheetCoder, a BERT-based model architecture to represent the tabular context in both row-based and column-based formats. We train our model on a large dataset of spreadsheets, and demonstrate that SpreadsheetCoder achieves top-1 prediction accuracy of 42.51%, which is a considerable improvement over baselines that do not employ rich tabular context. Compared to the rule-based system, SpreadsheetCoder assists 82% more users in composing formulas on Google Sheets.

هندسة البرمجيات التعلم الآلي لغات البرمجة