ﻻ يوجد ملخص باللغة العربية
Statistics comes in two main flavors: frequentist and Bayesian. For historical and technical reasons, frequentist statistics have traditionally dominated empirical data analysis, and certainly remain prevalent in empirical software engineering. This situation is unfortunate because frequentist statistics suffer from a number of shortcomings---such as lack of flexibility and results that are unintuitive and hard to interpret---that curtail their effectiveness when dealing with the heterogeneous data that is increasingly available for empirical analysis of software engineering practice. In this paper, we pinpoint these shortcomings, and present Bayesian data analysis techniques that provide tangible benefits---as they can provide clearer results that are simultaneously robust and nuanced. After a short, high-level introduction to the basic tools of Bayesian statistics, we present the reanalysis of two empirical studies on the effectiveness of automatically generated tests and the performance of programming languages. By contrasting the original frequentist analyses with our new Bayesian analyses, we demonstrate the concrete advantages of the latter. To conclude we advocate a more prominent role for Bayesian statistical techniques in empirical software engineering research and practice.
Empirical Standards are natural-language models of a scientific communitys expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research me
Software engineering research is evolving and papers are increasingly based on empirical data from a multitude of sources, using statistical tests to determine if and to what degree empirical evidence supports their hypotheses. To investigate the pra
Statistical analysis is the tool of choice to turn data into information, and then information into empirical knowledge. To be valid, the process that goes from data to knowledge should be supported by detailed, rigorous guidelines, which help ferret
Context: The utility of prediction models in empirical software engineering (ESE) is heavily reliant on the quality of the data used in building those models. Several data quality challenges such as noise, incompleteness, outliers and duplicate data
Reliable empirical models such as those used in software effort estimation or defect prediction are inherently dependent on the data from which they are built. As demands for process and product improvement continue to grow, the quality of the data u