An empirical evaluation of the usefulness of Tree Kernels for Commit-time Defect Detection in large software systems


Published by Hareem Sahar

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Hareem Sahar, Yuxin Liu, Abram Hindle

Software Engineering


Abstract

Defect detection at commit check-in time prevents the introduction of defects into software systems. Current defect detection approaches rely on metric-based models which are not very accurate and whose results are not directly useful for developers. We propose a method to detect bug-inducing commits by comparing the incoming changes with all past commits in the project, considering both those that introduced defects and those that did not. Our method considers individual changes in the commit separately, at the method-level granularity. Doing so helps developers as they are informed of specific methods that need further attention instead of being told that the entire commit is problematic. Our approach represents source code as abstract syntax trees and uses tree kernels to estimate the similarity of the code with previous commits. We experiment with subtree kernels (STK), subset tree kernels (SSTK), and partial tree kernels (PTK). An incoming change is then classified using a K-NN classifier on the past changes. We evaluate our approach on the BigCloneBench benchmark and on the Technical Debt dataset, using the NiCad clone detector as the baseline. Our experiments with the BigCloneBench benchmark show that the tree kernel approach can detect clones with a comparable MAP to that of NiCad. Also, on defect detection with the Technical Debt dataset, tree kernels are at least as effective as NiCad with MRR, F-score, and Accuracy of 0.87, 0.80, and 0.82 respectively.
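The pipeline described in the abstract (parse a method into an abstract syntax tree, compare it to past changes with a tree kernel, then vote among the nearest neighbours) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Python source parsed with the standard `ast` module as a stand-in for the paper's input, implements only a simple normalized subtree kernel that counts identical complete subtrees by node type, and the names `subtree_signatures`, `subtree_kernel`, `similarity`, `classify`, and the labels `"buggy"`/`"clean"` are hypothetical.

```python
import ast
import math
from collections import Counter


def subtree_signatures(source: str) -> Counter:
    """Count every complete subtree of the AST, keyed by a canonical string.

    Only node types are used, so identifier names and literal values are ignored.
    """
    counts: Counter = Counter()

    def visit(node: ast.AST) -> str:
        children = [visit(child) for child in ast.iter_child_nodes(node)]
        signature = type(node).__name__ + "(" + ",".join(children) + ")"
        counts[signature] += 1
        return signature

    visit(ast.parse(source))
    return counts


def subtree_kernel(a: Counter, b: Counter) -> int:
    """Number of pairs of identical complete subtrees shared by two trees."""
    return sum(count * b[sig] for sig, count in a.items() if sig in b)


def similarity(a: Counter, b: Counter) -> float:
    """Kernel value normalized to the range [0, 1]."""
    denom = math.sqrt(subtree_kernel(a, a) * subtree_kernel(b, b))
    return subtree_kernel(a, b) / denom if denom else 0.0


def classify(change: str, history: list, k: int = 3) -> str:
    """Label a change by majority vote among its k most similar past changes.

    `history` is a list of (source, label) pairs (assumed format).
    """
    target = subtree_signatures(change)
    scored = sorted(
        ((similarity(target, subtree_signatures(src)), label) for src, label in history),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy history of past method-level changes with hypothetical labels.
HISTORY = [
    ("def f(a, b):\n    return a / b\n", "buggy"),
    ("def g(x, y):\n    return x / y\n", "buggy"),
    ("def h(x):\n    if x:\n        return 1\n    return 0\n", "clean"),
    ("def m(items):\n    for i in items:\n        print(i)\n", "clean"),
]
```

Calling `classify("def div(p, q):\n    return p / q\n", HISTORY)` votes among the three most similar past methods. Because the sketch compares node types only, renamed but structurally identical methods get a similarity of 1; the subset tree and partial tree kernels mentioned above relax the matching to fragments of subtrees and are not shown here.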


Wei Tao, Yanlin Wang, Ensheng Shi (2021)

Commit messages are natural language descriptions of code changes, which are important for program understanding and maintenance. However, writing commit messages manually is time-consuming and laborious, especially when the code is updated frequently. Various approaches utilizing generation or retrieval techniques have been proposed to automatically generate commit messages. To achieve a better understanding of how the existing approaches perform in solving this problem, this paper conducts a systematic and in-depth analysis of the state-of-the-art models and datasets. We find that: (1) Different variants of the BLEU metric are used in previous works, which affects the evaluation and understanding of existing methods. (2) Most existing datasets are crawled only from Java repositories while repositories in other programming languages are not sufficiently explored. (3) Dataset splitting strategies can influence the performance of existing models by a large margin. Some models show better performance when the datasets are split by commit, while other models perform better when the datasets are split by timestamp or by project. Based on our findings, we conduct a human evaluation and find the BLEU metric that best correlates with the human scores for the task. We also collect a large-scale, information-rich, and multi-language commit message dataset MCMD and evaluate existing models on this dataset. Furthermore, we conduct extensive experiments under different dataset splitting strategies and suggest the suitable models under different scenarios. Based on the experimental results and findings, we provide feasible suggestions for comprehensively evaluating commit message generation models and discuss possible future research directions. We believe this work can help practitioners and researchers better evaluate and select models for automatic commit message generation.

Software Engineering, Artificial Intelligence

Prototype of Fault Adaptive Embedded Software for Large-Scale Real-Time Systems

Derek Messie (2005)

This paper describes a comprehensive prototype of large-scale fault adaptive embedded software developed for the proposed Fermilab BTeV high energy physics experiment. Lightweight self-optimizing agents embedded within Level 1 of the prototype are responsible for proactive and reactive monitoring and mitigation based on specified layers of competence. The agents are self-protecting, detecting cascading failures using a distributed approach. Adaptive, reconfigurable, and mobile objects for reliability are designed to be self-configuring to adapt automatically to dynamically changing environments. These objects provide a self-healing layer with the ability to discover, diagnose, and react to discontinuities in real-time processing. A generic modeling environment was developed to facilitate design and implementation of hardware resource specifications, application data flow, and failure mitigation strategies. Level 1 of the planned BTeV trigger system alone will consist of 2500 DSPs, so the number of components and intractable fault scenarios involved make it impossible to design an "expert system" that applies traditional centralized mitigative strategies based on rules capturing every possible system state. Instead, a distributed reactive approach is implemented using the tools and methodologies developed by the Real-Time Embedded Systems group.

Software Engineering

An Empirical Study of Software Exceptions in the Field using Search Logs

Foyzul Hassan, Chetan Bansal, Nachiappan Nagappan (2020)

Software engineers spend a substantial amount of time using Web search to accomplish software engineering tasks. Such search tasks include finding code snippets, API documentation, seeking help with debugging, etc. While debugging a bug or crash, one of the common practices of software engineers is to search for information about the associated error or exception traces on the internet. In this paper, we analyze query logs from a leading commercial general-purpose search engine (GPSE) such as Google, Yahoo! or Bing to carry out a large-scale study of software exceptions. To the best of our knowledge, this is the first large-scale study to analyze how Web search is used to find information about exceptions. We analyzed about 1 million exception-related search queries from a random sample of 5 billion web search queries. To extract exceptions from unstructured query text, we built a novel and high-performance machine learning model with an F1-score of 0.82. Using the machine learning model, we extracted exceptions from raw queries and performed popularity, effort, success, query characteristic and web domain analysis. We also performed programming language-specific analysis to give a better view of the exception search behavior. These techniques can help improve existing methods, documentation and tools for exception analysis and prediction. Further, similar techniques can be applied for APIs, frameworks, etc.

Software Engineering, Information Retrieval

An Empirical Evaluation of GDPR Compliance Violations in Android mHealth Apps

Ming Fan, Le Yu, Sen Chen (2020)

The purpose of the General Data Protection Regulation (GDPR) is to provide improved privacy protection. If an app controls personal data from users, it needs to be compliant with GDPR. However, GDPR lists general rules rather than exact step-by-step guidelines about how to develop an app that fulfills the requirements. Therefore, there may exist GDPR compliance violations in existing apps, which would pose severe privacy threats to app users. In this paper, we take mobile health applications (mHealth apps) as a peephole to examine the status quo of GDPR compliance in Android apps. We first propose an automated system, named mytool, to bridge the semantic gap between the general rules of GDPR and the app implementations by identifying the data practices declared in the app privacy policy and the data relevant behaviors in the app code. Then, based on mytool, we detect three kinds of GDPR compliance violations, including the incompleteness of privacy policy, the inconsistency of data collections, and the insecurity of data transmission. We perform an empirical evaluation of 796 mHealth apps. The results reveal that 189 (23.7%) of them do not provide complete privacy policies. Moreover, 59 apps collect sensitive data through different measures, but 46 (77.9%) of them contain at least one inconsistent collection behavior. Even worse, among the 59 apps, only 8 apps try to ensure the transmission security of collected data. However, all of them contain at least one encryption or SSL misuse. Our work exposes severe privacy issues to raise awareness of privacy protection for app users and developers.

Software Engineering, Cryptography and Security

Enablers and Impediments for Collaborative Research in Software Testing: An Empirical Exploration

Eduard Paul Enoiu, Adnan Causevic (2014)

When it comes to industrial organizations, current collaboration efforts in software engineering research are very often kept in-house, depriving these organizations of the skills necessary to build independent collaborative research. The current trend towards empirical software engineering research requires certain standards to be established which would guide these collaborative efforts in creating a strong partnership that promotes independent, evidence-based software engineering research. This paper examines key enabling factors for an efficient and effective industry-academia collaboration in the software testing domain. A major finding of the research was that while technology is a strong enabler to better collaboration, it must be complemented with industrial openness to disclose research results and the use of a dedicated tooling platform. We use as an example an automated test generation approach that has been developed in the last two years collaboratively with Bombardier Transportation AB in Sweden.

Software Engineering
