ﻻ يوجد ملخص باللغة العربية
Mutation testing is a well-established technique for assessing a test suites quality by injecting artificial faults into production code. In recent years, mutation testing has been extended to machine learning (ML) systems, and deep learning (DL) in particular; researchers have proposed approaches, tools, and statistically sound heuristics to determine whether mutants in DL systems are killed or not. However, as we will argue in this work, questions can be raised to what extent currently used mutation testing techniques in DL are actually in line with the classical interpretation of mutation testing. We observe that ML model development resembles a test-driven development (TDD) process, in which a training algorithm (`programmer) generates a model (program) that fits the data points (test data) to labels (implicit assertions), up to a certain threshold. However, considering proposed mutation testing techniques for ML systems under this TDD metaphor, in current approaches, the distinction between production and test code is blurry, and the realism of mutation operators can be challenged. We also consider the fundamental hypotheses underlying classical mutation testing: the competent programmer hypothesis and coupling effect hypothesis. As we will illustrate, these hypotheses do not trivially translate to ML system development, and more conscious and explicit scoping and concept mapping will be needed to truly draw parallels. Based on our observations, we propose several action points for better alignment of mutation testing techniques for ML with paradigms and vocabularies of classical mutation testing.
Deep learning (DL) defines a new data-driven programming paradigm where the internal system logic is largely shaped by the training data. The standard way of evaluating DL models is to examine their performance on a test dataset. The quality of the t
In the field of mutation analysis, mutation is the systematic generation of mutated programs (i.e., mutants) from an original program. The concept of mutation has been widely applied to various testing problems, including test set selection, fault lo
Mutation testing is used to evaluate the effectiveness of test suites. In recent years, a promising variation called extreme mutation testing emerged that is computationally less expensive. It identifies methods where their functionality can be entir
Diversity has been proposed as a key criterion to improve testing effectiveness and efficiency.It can be used to optimise large test repositories but also to visualise test maintenance issues and raise practitioners awareness about waste in test arte
ReTest is a novel testing tool for Java applications with a graphical user interface (GUI), combining monkey testing and difference testing. Since this combination sidesteps the oracle problem, it enables the generation of GUI-based regression tests.