Recommending Stack Overflow Posts for Fixing Runtime Exceptions using Failure Scenario Matching

104 0 0.0 ( 0 )

Download Cite

Added by Sonal Mahajan

Publication date 2020

fields Informatics Engineering

and research's language is English

Authors Sonal Mahajan Fujitsu Laboratories of America

Software Engineering

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Using online Q&A forums, such as Stack Overflow (SO), for guidance to resolve program bugs, among other development issues, is commonplace in modern software development practice. Runtime exceptions (RE) is one such important class of bugs that is actively discussed on SO. In this work we present a technique and prototype tool called MAESTRO that can automatically recommend an SO post that is most relevant to a given Java RE in a developers code. MAESTRO compares the exception-generating program scenario in the developers code with that discussed in an SO post and returns the post with the closest match. To extract and compare the exception scenario effectively, MAESTRO first uses the answer code snippets in a post to implicate a subset of lines in the posts question code snippet as responsible for the exception and then compares these lines with the developers code in terms of their respective Abstract Program Graph (APG) representations. The APG is a simplified and abstracted derivative of an abstract syntax tree, proposed in this work, that allows an effective comparison of the functionality embodied in the high-level program structure, while discarding many of the low-level syntactic or semantic differences. We evaluate MAESTRO on a benchmark of 78 instances of Java REs extracted from the top 500 Java projects on GitHub and show that MAESTRO can return either a highly relevant or somewhat relevant SO post corresponding to the exception instance in 71% of the cases, compared to relevant posts returned in only 8% - 44% instances, by four competitor tools based on state-of-the-art techniques. We also conduct a user experience study of MAESTRO with 10 Java developers, where the participants judge MAESTRO reporting a highly relevant or somewhat relevant post in 80% of the instances. In some cases the post is judged to be even better than the one manually found by the participant.

rate research

Broken External Links on Stack Overflow

413 - Jiakun Liu , Xin Xia , David Lo 2020

Stack Overflow hosts valuable programming-related knowledge with 11,926,354 links that reference to the third-party websites. The links that reference to the resources hosted outside the Stack Overflow websites extend the Stack Overflow knowledge base substantially. However, with the rapid development of programming-related knowledge, many resources hosted on the Internet are not available anymore. Based on our analysis of the Stack Overflow data that was released on Jun. 2, 2019, 14.2% of the links on Stack Overflow are broken links. The broken links on Stack Overflow can obstruct viewers from obtaining desired programming-related knowledge, and potentially damage the reputation of the Stack Overflow as viewers might regard the posts with broken links as obsolete. In this paper, we characterize the broken links on Stack Overflow. 65% of the broken links in our sampled questions are used to show examples, e.g., code examples. 70% of the broken links in our sampled answers are used to provide supporting information, e.g., explaining a certain concept and describing a step to solve a problem. Only 1.67% of the posts with broken links are highlighted as such by viewers in the posts comments. Only 5.8% of the posts with broken links removed the broken links. Viewers cannot fully rely on the vote scores to detect broken links, as broken links are common across posts with different vote scores. The websites that host resources that can be maintained by their users are referenced by broken links the most on Stack Overflow -- a prominent example of such websites is GitHub. The posts and comments related to the web technologies, i.e., JavaScript, HTML, CSS, and jQuery, are associated with more broken links. Based on our findings, we shed lights for future directions and provide recommendations for practitioners and researchers.

Software Engineering

DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletons

158 - Dawn Drain , Colin B. Clement , Guillermo Serrato 2021

The joint task of bug localization and program repair is an integral part of the software development process. In this work we present DeepDebug, an approach to automated debugging using large, pretrained transformers. We begin by training a bug-creation model on reversed commit data for the purpose of generating synthetic bugs. We apply these synthetic bugs toward two ends. First, we directly train a backtranslation model on all functions from 200K repositories. Next, we focus on 10K repositories for which we can execute tests, and create bug

Software Engineering Machine Learning

Stack Overflow in Github: Any Snippets There?

74 - Di Yang , Pedro Martins , Vaibhav Saini 2017

When programmers look for how to achieve certain programming tasks, Stack Overflow is a popular destination in search engine results. Over the years, Stack Overflow has accumulated an impressive knowledge base of snippets of code that are amply documented. We are interested in studying how programmers use these snippets of code in their projects. Can we find Stack Overflow snippets in real projects? When snippets are used, is this copy literal or does it suffer adaptations? And are these adaptations specializations required by the idiosyncrasies of the target artifact, or are they motivated by specific requirements of the programmer? The large-scale study presented on this paper analyzes 909k non-fork Python projects hosted on Github, which contain 290M function definitions, and 1.9M Python snippets captured in Stack Overflow. Results are presented as quantitative analysis of block-level code cloning intra and inter Stack Overflow and GitHub, and as an analysis of programming behaviors through the qualitative analysis of our findings.

Software Engineering

Generating Question Titles for Stack Overflow from Mined Code Snippets

91 - Zhipeng Gao , Xin Xia , John Grundy 2020

Stack Overflow has been heavily used by software developers as a popular way to seek programming-related information from peers via the internet. The Stack Overflow community recommends users to provide the related code snippet when they are creating a question to help others better understand it and offer their help. Previous studies have shown that} a significant number of these questions are of low-quality and not attractive to other potential experts in Stack Overflow. These poorly asked questions are less likely to receive useful answers and hinder the overall knowledge generation and sharing process. Considering one of the reasons for introducing low-quality questions in SO is that many developers may not be able to clarify and summarize the key problems behind their presented code snippets due to their lack of knowledge and terminology related to the problem, and/or their poor writing skills, in this study we propose an approach to assist developers in writing high-quality questions by automatically generating question titles for a code snippet using a deep sequence-to-sequence learning approach. Our approach is fully data-driven and uses an attention mechanism to perform better content selection, a copy mechanism to handle the rare-words problem and a coverage mechanism to eliminate word repetition problem. We evaluate our approach on Stack Overflow datasets over a variety of programming languages (e.g., Python, Java, Javascript, C# and SQL) and our experimental results show that our approach significantly outperforms several state-of-the-art baselines in both automatic and human evaluation. We have released our code and datasets to facilitate other researchers to verify their ideas and inspire the follow-up work.

Software Engineering

Mining Architecture Tactics and Quality Attributes Knowledge in Stack Overflow

177 - Tingting Bi , Peng Liang , Antony Tang 2021

Context: Architecture Tactics (ATs) are architectural building blocks that provide general architectural solutions for addressing Quality Attributes (QAs) issues. Mining and analyzing QA-AT knowledge can help the software architecture community better understand architecture design. However, manually capturing and mining this knowledge is labor-intensive and difficult. Objective: Using Stack Overflow (SO) as our source, our main goals are to effectively mine such knowledge; and to have some sense of how developers use ATs with respect to QA concerns from related discussions. Methods: We applied a semi-automatic dictionary-based mining approach to extract the QA-AT posts in SO. With the mined QA-AT posts, we identified the relationships between ATs and QAs. Results: Our approach allow us to mine QA-AT knowledge effectively with an F-measure of 0.865 and Performance of 82.2%. Using this mining approach, we are able to discover architectural synonyms of QAs and ATs used by designers, from which we discover how developers apply ATs to address quality requirements. Conclusions: We make two contributions in this work: First, we demonstrated a semi-automatic approach to mine ATs and QAs from SO posts; Second, we identified little-known design relationships between QAs and ATs and grouped architectural design considerations to aid architects make architecture tactics design decisions.

Software Engineering