Systematic Literature Review of Validation Methods for AI Systems

208 0 0.0 ( 0 )

Download Cite

Added by Lalli Myllyaho

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Lalli Myllyaho - Mikko Raatikainen - Tomi Mannisto

Software Engineering

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Context: Artificial intelligence (AI) has made its way into everyday activities, particularly through new techniques such as machine learning (ML). These techniques are implementable with little domain knowledge. This, combined with the difficulty of testing AI systems with traditional methods, has made system trustworthiness a pressing issue. Objective: This paper studies the methods used to validate practical AI systems reported in the literature. Our goal is to classify and describe the methods that are used in realistic settings to ensure the dependability of AI systems. Method: A systematic literature review resulted in 90 papers. Systems presented in the papers were analysed based on their domain, task, complexity, and applied validation methods. Results: The validation methods were synthesized into a taxonomy consisting of trial, simulation, model-centred validation, and expert opinion. Failure monitors, safety channels, redundancy, voting, and input and output restrictions are methods used to continuously validate the systems after deployment. Conclusions: Our results clarify existing strategies applied to validation. They form a basis for the synthesization, assessment, and refinement of AI system validation in research and guidelines for validating individual systems in practice. While various validation strategies have all been relatively widely applied, only few studies report on continuous validation. Keywords: artificial intelligence, machine learning, validation, testing, V&V, systematic literature review.

rate research

A Systematic Literature Review on Blockchain Governance

102 - Yue Liu , Qinghua Lu , Liming Zhu 2021

Blockchain has been increasingly used as a software component to enable decentralisation in software architecture for a variety of applications. Blockchain governance has received considerable attention to ensure the safe and appropriate use and evolution of blockchain, especially after the Ethereum DAO attack in 2016. To understand the state-of-the-art of blockchain governance and provide an actionable guidance for academia and practitioners, in this paper, we conduct a systematic literature review, identifying 34 primary studies. Our study comprehensively investigates blockchain governance via 5W1H questions. The study results reveal several major findings: 1) the adaptation and upgrade of blockchain are the primary purposes of blockchain governance, while both software quality attributes and human value attributes need to be increasingly considered; 2) blockchain governance mainly relies on the project team, node operators, and users of a blockchain platform; and 3) existing governance solutions can be classified into process mechanisms and product mechanisms, which mainly focus on the operation phase over the blockchain platform layer.

Software Engineering

Software Development Analytics in Practice: A Systematic Literature Review

191 - Joao Caldeira , Fernando Brito e Abreu , Jorge Cardoso 2020

Context:Software Development Analytics is a research area concerned with providing insights to improve product deliveries and processes. Many types of studies, data sources and mining methods have been used for that purpose. Objective:This systematic literature review aims at providing an aggregate view of the relevant studies on Software Development Analytics in the past decade (2010-2019), with an emphasis on its application in practical settings. Method:Definition and execution of a search string upon several digital libraries, followed by a quality assessment criteria to identify the most relevant papers. On those, we extracted a set of characteristics (study type, data source, study perspective, development life-cycle activities covered, stakeholders, mining methods, and analytics scope) and classified their impact against a taxonomy. Results:Source code repositories, experimental case studies, and developers are the most common data sources, study types, and stakeholders, respectively. Product and project managers are also often present, but less than expected. Mining methods are evolving rapidly and that is reflected in the long list identified. Descriptive statistics are the most usual method followed by correlation analysis. Being software development an important process in every organization, it was unexpected to find that process mining was present in only one study. Most contributions to the software development life cycle were given in the quality dimension. Time management and costs control were lightly debated. The analysis of security aspects suggests it is an increasing topic of concern for practitioners. Risk management contributions are scarce. Conclusions:There is a wide improvement margin for software development analytics in practice. For instance, mining and analyzing the activities performed by software developers in their actual workbench, the IDE.

Software Engineering

Software Testing Process Models Benefits & Drawbacks: a Systematic Literature Review

188 - Katarina Hrabovska , Bruno Rossi , Tomav{s} Pitner 2019

Context: Software testing plays an essential role in product quality improvement. For this reason, several software testing models have been developed to support organizations. However, adoption of testing process models inside organizations is still sporadic, with a need for more evidence about reported experiences. Aim: Our goal is to identify results gathered from the application of software testing models in organizational contexts. We focus on characteristics such as the context of use, practices applied in different testing process phases, and reported benefits & drawbacks. Method: We performed a Systematic Literature Review (SLR) focused on studies about the application of software testing processes, complemented by results from previous reviews. Results: From 35 primary studies and survey-based articles, we collected 17 testing models. Although most of the existing models are described as applicable to general contexts, the evidence obtained from the studies shows that some models are not suitable for all enterprise sizes, and inadequate for specific domains. Conclusion: The SLR evidence can serve to compare different software testing models for applicability inside organizations. Both benefits and drawbacks, as reported in the surveyed cases, allow getting a better view of the strengths and weaknesses of each model.

Software Engineering

Ethics of AI: A Systematic Literature Review of Principles and Challenges

307 - Arif Ali Khan , Sher Badshah , Peng Liang 2021

Ethics in AI becomes a global topic of interest for both policymakers and academic researchers. In the last few years, various research organizations, lawyers, think tankers and regulatory bodies get involved in developing AI ethics guidelines and principles. However, there is still debate about the implications of these principles. We conducted a systematic literature review (SLR) study to investigate the agreement on the significance of AI principles and identify the challenging factors that could negatively impact the adoption of AI ethics principles. The results reveal that the global convergence set consists of 22 ethical principles and 15 challenges. Transparency, privacy, accountability and fairness are identified as the most common AI ethics principles. Similarly, lack of ethical knowledge and vague principles are reported as the significant challenges for considering ethics in AI. The findings of this study are the preliminary inputs for proposing a maturity model that assess the ethical capabilities of AI systems and provide best practices for further improvements.

Computers and Society Artificial Intelligence

Using Machine Learning to Generate Test Oracles: A Systematic Literature Review

91 - Afonso Fontes , Gregory Gay 2021

Machine learning may enable the automated generation of test oracles. We have characterized emerging research in this area through a systematic literature review examining oracle types, researcher goals, the ML techniques applied, how the generation process was assessed, and the open research challenges in this emerging field. Based on a sample of 22 relevant studies, we observed that ML algorithms generated test verdict, metamorphic relation, and - most commonly - expected output oracles. Almost all studies employ a supervised or semi-supervised approach, trained on labeled system executions or code metadata - including neural networks, support vector machines, adaptive boosting, and decision trees. Oracles are evaluated using the mutation score, correct classifications, accuracy, and ROC. Work-to-date show great promise, but there are significant open challenges regarding the requirements imposed on training data, the complexity of modeled functions, the ML algorithms employed - and how they are applied - the benchmarks used by researchers, and replicability of the studies. We hope that our findings will serve as a roadmap and inspiration for researchers in this field.

Software Engineering