Particle physics experiments such as those run at the Large Hadron Collider produce huge quantities of data, which are boiled down to a few numbers from which it is hoped that a signal will be detected. We discuss a simple probability model for this and derive frequentist and noninformative Bayesian procedures for inference about the signal. Both are highly accurate in realistic cases, with the frequentist procedure having the edge for interval estimation, and the Bayesian procedure yielding slightly better point estimates. We also argue that the significance, or $p$-value, function based on the modified likelihood root provides a comprehensive presentation of the information in the data and should be used for inference.
Spatial prediction of weather elements such as temperature, precipitation, and barometric pressure is generally based on satellite imagery or data collected at ground stations. Neither source provides information at a more granular, hyper-local resolution. On the other hand, crowdsourced weather data, which are captured by sensors installed on mobile devices and gathered by weather-related mobile apps like WeatherSignal and AccuWeather, can serve as potential data sources for analyzing environmental processes at a hyper-local resolution. However, because the sensors are of low quality and operate in a non-laboratory environment, the quality of the observations in crowdsourced data is compromised. This paper describes methods to improve hyper-local spatial prediction using this noisy crowdsourced information of varying quality. We introduce a reliability metric, the Veracity Score (VS), to assess the quality of the crowdsourced observations using coarser, but high-quality, reference data. A VS-based methodology to analyze noisy spatial data is proposed and evaluated through extensive simulations. The merits of the proposed approach are illustrated through case studies analyzing crowdsourced daily average ambient temperature readings for one day in the contiguous United States.
In this article we derive an unbiased expression for the expected mean-squared error associated with continuously differentiable estimators of the noncentrality parameter of a chi-square random variable. We then consider the task of denoising squared-magnitude magnetic resonance image data, which are well modeled as independent noncentral chi-square random variables on two degrees of freedom. We consider two broad classes of linearly parameterized shrinkage estimators that can be optimized using our risk estimate, one in the general context of undecimated filterbank transforms, and another in the specific case of the unnormalized Haar wavelet transform. The resultant algorithms are computationally tractable and improve upon state-of-the-art methods for both simulated and actual magnetic resonance image data.
How should social scientists understand and communicate the uncertainty of statistically estimated causal effects? It is well known that the conventional significance-vs.-insignificance approach is associated with misunderstandings and misuses. Behavioral research suggests people understand uncertainty more appropriately on a numerical, continuous scale than on a verbal, discrete scale. Motivated by these findings, I propose presenting the probabilities of different effect sizes. Probability is an intuitive continuous measure of uncertainty. It allows researchers to better understand and communicate the uncertainty of statistically estimated effects. In addition, unlike the conventional approaches, my approach needs no decision threshold for an uncertainty measure or an effect size, so researchers can remain agnostic about a threshold such as p<5% and about its justification. I apply my approach to a previous social scientific study, showing that it enables richer inference than the significance-vs.-insignificance approach taken by the original study. The accompanying R package makes my approach easy to implement.
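The idea of reporting probabilities of different effect sizes can be illustrated with a minimal sketch. It is not the accompanying R package: it assumes, purely for illustration, that uncertainty about the effect is summarized by a normal distribution centered on the point estimate with the standard error as its scale. The function name `prob_effect_exceeds` and the numbers are hypothetical.

```python
from statistics import NormalDist


def prob_effect_exceeds(estimate, std_error, thresholds):
    """Return P(effect > t) for each threshold t.

    Assumption (not from the abstract): the uncertainty about the effect is
    approximated by Normal(estimate, std_error).
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    dist = NormalDist(mu=estimate, sigma=std_error)
    return {t: 1.0 - dist.cdf(t) for t in thresholds}


# Hypothetical estimate and standard error, for illustration only.
probs = prob_effect_exceeds(estimate=0.5, std_error=0.25,
                            thresholds=[0.0, 0.25, 0.5])
for threshold, prob in probs.items():
    print(f"P(effect > {threshold}) = {prob:.3f}")
```

Reporting the whole set of probabilities, rather than a single significant-or-not verdict, is what lets the reader judge the evidence for whichever effect size matters to them.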
The determination of the infection fatality rate (IFR) for the novel SARS-CoV-2 coronavirus is a key aim for many of the field studies that are currently being undertaken in response to the pandemic. The IFR and the basic reproduction number $R_0$ are the main epidemic parameters describing the severity and transmissibility of the virus, respectively. The IFR can also be used as a basis for estimating and monitoring the number of infected individuals in a population, which may subsequently be used to inform policy decisions relating to public health interventions and lockdown strategies. The interpretation of IFR measurements requires the calculation of confidence intervals. We present a number of statistical methods that are relevant in this context and develop an inverse problem formulation to determine correction factors to mitigate time-dependent effects that can lead to biased IFR estimates. We also review a number of methods to combine IFR estimates from multiple independent studies, provide example calculations throughout this note, and conclude with a summary and best practice recommendations. The developed code is available online.
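As an illustration of the two tasks mentioned above (interval estimation and combining studies), here is a minimal sketch. It is not the code released with the note. It uses two standard textbook choices that the abstract does not name: the Wilson score interval for a binomial proportion and fixed-effect inverse-variance pooling. It also assumes the number of infections is known exactly, which in practice it rarely is. The function names and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(deaths, infections, level=0.95):
    """Wilson score confidence interval for the proportion deaths / infections.

    Assumption: deaths ~ Binomial(infections, IFR) with infections known.
    """
    if infections <= 0 or not 0 <= deaths <= infections:
        raise ValueError("need 0 <= deaths <= infections and infections > 0")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    p_hat = deaths / infections
    denom = 1.0 + z * z / infections
    center = (p_hat + z * z / (2.0 * infections)) / denom
    half_width = z * math.sqrt(
        p_hat * (1.0 - p_hat) / infections + z * z / (4.0 * infections ** 2)
    ) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


def pool_inverse_variance(estimates, std_errors):
    """Fixed-effect pooled estimate and its standard error.

    Each study is weighted by 1 / std_error**2; studies are assumed
    independent and to estimate the same underlying quantity.
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need equally sized, non-empty inputs")
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical counts and study estimates, for illustration only.
low, high = wilson_interval(deaths=10, infections=1000)
print(f"IFR 95% interval: [{low:.4f}, {high:.4f}]")
pooled, pooled_se = pool_inverse_variance([0.010, 0.020], [0.005, 0.005])
print(f"pooled IFR: {pooled:.4f} (SE {pooled_se:.4f})")
```

The Wilson interval is used here because it behaves sensibly for small proportions and small counts, where the simple normal-approximation interval can extend below zero.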
Cryo-electron microscopy (cryo-EM) is an emerging experimental method to characterize the structure of large biomolecular assemblies. Single-particle cryo-EM records 2D images (so-called micrographs) of projections of the three-dimensional particle, which need to be processed to obtain the three-dimensional reconstruction. A crucial step in the reconstruction process is particle picking, which involves detecting particles in noisy 2D micrographs with low signal-to-noise ratios of typically 1:10 or even lower. Each micrograph usually contains a large number of particles, and the particles have unknown, irregular, and nonconvex shapes.