We provide accessible insight into the current replication crisis in statistical science by revisiting the old metaphor of the court trial as a hypothesis test. Inter alia, we define and diagnose harmful statistical witch-hunting in both justice and science, which extends to the replication crisis itself, where a hunt on p-values is currently underway.
Increased availability of data and accessibility of computational tools in recent years have created unprecedented opportunities for scientific research driven by statistical analysis. Inherent limitations of statistics impose constraints on the reliability of conclusions drawn from data, but misuse of statistical methods is a growing concern. Significance, hypothesis testing and the accompanying P-values are being scrutinized as representing the most widely applied and abused practices. One line of critique is that P-values are inherently unfit to fulfill their ostensible role as measures of a scientific hypothesis's credibility. It has also been suggested that while P-values may have their role as summary measures of effect, researchers underappreciate the degree of randomness in the P-value. High variability of P-values would suggest that, having obtained a small P-value in one study, one is nevertheless likely to obtain a much larger P-value in a similarly powered replication study. Thus, the replicability of the P-value is itself questionable. To characterize P-value variability, one can use prediction intervals whose endpoints reflect the likely spread of P-values that could have been obtained by a replication study. Unfortunately, the intervals currently in use, the P-intervals, are based on unrealistic implicit assumptions. Namely, P-intervals are constructed with assumptions that imply substantial chances of encountering large values of effect size in an observational study, which leads to bias. As an alternative to P-intervals, we develop a method that gives researchers flexibility by providing them with the means to control these assumptions. Unlike endpoints of P-intervals, endpoints of our intervals are directly interpreted as probabilistic bounds for replication P-values and are resistant to selection bias, contingent upon approximate prior knowledge of the effect size distribution.
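The spread of replication P-values described above can be illustrated with a short sketch of the classical P-interval. It assumes a one-sided z-test and the flat-prior setup in which the replication z-statistic is distributed as N(z_obs, 2); this is the construction criticized in the abstract, not the alternative method the authors develop. The function names and the 80% default level are illustrative choices.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def one_sided_p(z):
    """One-sided P-value for a z-statistic (upper tail)."""
    return 1.0 - STD_NORMAL.cdf(z)


def p_interval(p_obs, level=0.80):
    """Classical prediction interval for a replication P-value.

    Assumption (not taken from the source text): under a flat prior on
    the effect size, z_rep - z_obs ~ N(0, 2), so the replication
    z-statistic lies within z_obs +/- q * sqrt(2) with probability
    `level`, where q is the matching standard normal quantile.
    """
    z_obs = STD_NORMAL.inv_cdf(1.0 - p_obs)
    half_width = STD_NORMAL.inv_cdf(0.5 + level / 2.0) * 2 ** 0.5
    lower_p = one_sided_p(z_obs + half_width)  # larger z gives smaller P
    upper_p = one_sided_p(z_obs - half_width)  # smaller z gives larger P
    return lower_p, upper_p


if __name__ == "__main__":
    low, high = p_interval(0.025)
    print(f"80% P-interval for p_obs = 0.025: ({low:.5f}, {high:.3f})")
```

Under these assumptions, even a one-sided P-value of 0.025 is compatible with replication P-values spanning several orders of magnitude, which is the variability the abstract refers to.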
The field of data science currently enjoys a broad definition that includes a wide array of activities which borrow from many other established fields of study. Having such a vague characterization of a field in the early stages might be natural, but over time maintaining such a broad definition becomes unwieldy and impedes progress. In particular, the teaching of data science is hampered by the seeming need to cover many different points of interest. Data scientists must ultimately identify the core of the field by determining what makes the field unique and what it means to develop new knowledge in data science. In this review we attempt to distill some core ideas from data science by focusing on the iterative process of data analysis and develop some generalizations from past experience. Generalizations of this nature could form the basis of a theory of data science and would serve to unify and scale the teaching of data science to large audiences.
The role of probability appears unchallenged as the key measure of uncertainty, used among other things for practical induction in the empirical sciences. Yet Popper was emphatic in his rejection of inductive probability and of the logical probability of hypotheses; furthermore, for him, the degree of corroboration cannot be a probability. Instead he proposed a deductive method of testing. This dialectic tension has many parallels in statistics, with the Bayesians on the logico-inductive side and the non-Bayesians, or frequentists, on the other side. Simplistically, Popper seems to be on the frequentist side, but recent synthesis on the non-Bayesian side might direct the Popperian views to a more nuanced destination. Logical probability seems perfectly suited to measure partial evidence or support, so what can we use if we are to reject it? For the past 100 years, statisticians have also developed a related concept called likelihood, which has played a central role in statistical modelling and inference. Remarkably, this Fisherian concept of uncertainty is largely unknown or at least severely under-appreciated in non-statistical literature. As a measure of corroboration, the likelihood satisfies the Popperian requirement that it is not a probability. Our aim is to introduce the likelihood and its recent extension via a discussion of two well-known logical fallacies, in order to highlight that its lack of recognition may have led to unnecessary confusion in our discourse about falsification and corroboration of hypotheses. We highlight the 100 years of development of likelihood concepts. The year 2021 will mark the 100-year anniversary of the likelihood, so with this paper we wish it a long life and increased appreciation in non-statistical literature.
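The claim that the likelihood measures support without being a probability can be made concrete with a small sketch. It assumes a binomial model with hypothetical data (7 successes in 10 trials), chosen purely for illustration and not taken from the paper. The relative likelihood compares a hypothesized parameter value with the best-supported one, and the likelihood function integrates to 1/(n+1) over the parameter rather than to 1.

```python
from math import comb


def binomial_likelihood(theta, successes, trials):
    """Likelihood of success probability theta given the observed counts."""
    failures = trials - successes
    return comb(trials, successes) * theta ** successes * (1.0 - theta) ** failures


def relative_likelihood(theta, successes, trials):
    """Likelihood of theta divided by the likelihood at the maximum,
    which for the binomial model is theta_hat = successes / trials."""
    theta_hat = successes / trials
    return (binomial_likelihood(theta, successes, trials)
            / binomial_likelihood(theta_hat, successes, trials))


def likelihood_area(successes, trials, steps=10_000):
    """Midpoint-rule integral of the likelihood over theta in [0, 1]."""
    width = 1.0 / steps
    return sum(
        binomial_likelihood((i + 0.5) * width, successes, trials) * width
        for i in range(steps)
    )


if __name__ == "__main__":
    # Hypothetical data: 7 successes in 10 trials.
    print(relative_likelihood(0.5, 7, 10))  # support for theta = 0.5 relative to 0.7
    print(likelihood_area(7, 10))           # close to 1/11, not 1
```

Because the area under the likelihood is not 1, it cannot be read as a probability distribution over hypotheses; only ratios of likelihoods carry meaning as comparative support.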
Donoho's JCGS (in press) paper is a spirited call to action for statisticians, who he points out are losing ground in the field of data science by refusing to accept that data science is its own domain. (Or, at least, a domain that is becoming distinctly defined.) He calls on writings by John Tukey, Bill Cleveland, and Leo Breiman, among others, to remind us that statisticians have been dealing with data science for years, and encourages acceptance of the direction of the field while also ensuring that statistics is tightly integrated. As faculty at baccalaureate institutions (where the growth of undergraduate statistics programs has been dramatic), we are keen to ensure statistics has a place in data science and data science education. In his paper, Donoho is primarily focused on graduate education. At our undergraduate institutions, we are considering many of the same questions.
Crisis informetrics is considered to be a relatively new and emerging area of research, which deals with the application of analytical approaches of network and information science combined with experimental learning approaches of statistical mechanics to explore communication and information flow, as well as the robustness and tolerance of complex crisis networks under threats. In this paper, we discuss the scale-free network property of an organizational communication network and test both the traditional (static) and dynamic topology of social networks during organizational crises. Both types of topology exhibit similar characteristics of prominent actors, reinforcing the power-law distribution nature of scale-free networks. There are no significant fluctuations in actor prominence between daily and aggregated networks. We found that the email communication network displays a high degree of scale-free behavior described by a power law.
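One common way to check for the power-law behavior described above is to estimate the exponent of the degree distribution. The sketch below is an assumption about how this could be done, not the authors' procedure: it uses the continuous maximum-likelihood estimator alpha = 1 + n / sum(ln(k_i / k_min)), treating node degrees as continuous values at or above a cutoff k_min. The function names and the synthetic sample are hypothetical; real degrees would come from the email network itself.

```python
import math
import random


def estimate_power_law_exponent(degrees, k_min=1.0):
    """Continuous maximum-likelihood estimate of the exponent alpha for
    degrees k >= k_min, assuming p(k) is proportional to k ** (-alpha)."""
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / k_min) for k in tail)
    if log_sum == 0.0:
        raise ValueError("all degrees equal k_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum


def sample_power_law(n, alpha, k_min=1.0, seed=0):
    """Draw n synthetic values from a continuous power law by inverse
    transform sampling (used here only to exercise the estimator)."""
    rng = random.Random(seed)
    return [
        k_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(n)
    ]


if __name__ == "__main__":
    synthetic_degrees = sample_power_law(20_000, alpha=2.5)
    print(estimate_power_law_exponent(synthetic_degrees))
```

Comparing the exponent estimated on each daily network with the one from the aggregated network would be one way to quantify the stability reported in the abstract, although integer-valued degrees would call for the discrete form of the estimator.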