Many predictions are probabilistic in nature; for example, a forecast may call for precipitation tomorrow, but with only a 30 percent chance. Given both the predictions and the actual outcomes, reliability diagrams (also known as calibration plots) help detect and diagnose statistically significant discrepancies between the predictions and the outcomes. The canonical reliability diagrams histogram the observed and expected values of the predictions; several variants of the standard reliability diagrams replace the hard histogram binning with soft kernel density estimation, using smooth convolutional kernels whose widths are similar to the widths of the bins. In all cases, an important question naturally arises: which widths are best (or are multiple plots with different widths better)? Rather than answering this question, plots of the cumulative differences between the observed and expected values largely avoid it, displaying miscalibration directly as the slopes of secant lines of the graphs. Slope is easy to perceive with quantitative precision even when the constant offsets of the secant lines are irrelevant, and there is no need to bin or to perform kernel density estimation with a somewhat arbitrary kernel.
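The cumulative-differences construction above can be sketched in a few lines. The function name and interface below are illustrative, not taken from the paper; the sketch sorts by predicted probability and accumulates observed minus expected, so that miscalibration over a range of predictions appears as the slope of a secant line of the resulting graph:

```python
def cumulative_differences(predictions, outcomes):
    """Graph of cumulative differences between observed outcomes and
    predicted probabilities, with predictions sorted in increasing order.

    No binning or kernel density estimation is involved; miscalibration
    over a range of predictions shows up as the slope of a secant line.
    """
    pairs = sorted(zip(predictions, outcomes), key=lambda p: p[0])
    n = len(pairs)
    total = 0.0
    graph = []
    for p, y in pairs:
        # Running sum of (observed - expected), normalized by n.
        total += y - p
        graph.append(total / n)
    return graph
```

For perfectly calibrated predictions the graph is flat at zero; systematic under-prediction produces a sustained positive slope.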
Comparing the differences in outcomes (that is, in dependent variables) between two subpopulations is often most informative when comparing outcomes only for individuals from the subpopulations who are similar according to the independent variables. …
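A minimal sketch of the matching idea in the abstract above, assuming a single scalar independent variable: each individual in one subpopulation is paired with the most similar individual in the other, and only paired outcomes are compared. The function name and nearest-neighbor pairing are illustrative assumptions, not the paper's method:

```python
def matched_mean_difference(covariates_a, outcomes_a, covariates_b, outcomes_b):
    """Average difference in outcomes between two subpopulations,
    comparing each member of A only with the member of B whose
    (scalar) independent variable is closest."""
    diffs = []
    for ca, ya in zip(covariates_a, outcomes_a):
        # Nearest neighbor in B according to the independent variable.
        j = min(range(len(covariates_b)),
                key=lambda k: abs(covariates_b[k] - ca))
        diffs.append(ya - outcomes_b[j])
    return sum(diffs) / len(diffs)
```

Restricting the comparison to matched pairs prevents differences in the covariate distributions from masquerading as differences in outcomes.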
Hedges' d, an existing unbiased effect size for the difference between two means, assumes equality of variances. However, this assumption is fragile and is often violated in practical applications. Here, we define e, a new effect size …
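The new effect size e is not defined in the truncated abstract above; for context, here is a sketch of the classical pooled-variance Hedges' d that the abstract references, using the common approximation 1 - 3/(4(n1+n2) - 9) to the exact gamma-function correction factor:

```python
import math

def hedges_d(x, y):
    """Hedges' d: standardized difference between two sample means.

    Uses the pooled variance, and therefore assumes the two samples
    share a common variance -- the assumption the abstract notes is
    fragile in practice.
    """
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    s1 = sum((v - m1) ** 2 for v in x) / (n1 - 1)
    s2 = sum((v - m2) ** 2 for v in y) / (n2 - 1)
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    # Small-sample bias correction (common approximation to the exact
    # correction, which involves ratios of gamma functions).
    j = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    return j * (m1 - m2) / math.sqrt(pooled)
```

When the two variances differ substantially, the pooled denominator no longer estimates a meaningful common scale, which motivates an effect size that drops the equal-variance assumption.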
It is well known that Markov chain Monte Carlo (MCMC) methods scale poorly with dataset size. A popular class of methods for addressing this issue is stochastic gradient MCMC. These methods use a noisy estimate of the gradient of the log posterior, which …
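A minimal sketch of one step of stochastic gradient Langevin dynamics (SGLD), the prototypical stochastic gradient MCMC method, for a scalar parameter; the function names and interface are illustrative assumptions. The full-data gradient of the log likelihood is replaced by a minibatch estimate rescaled by N / batch_size, and Gaussian noise of variance equal to the step size is injected:

```python
import random

def sgld_step(theta, data, grad_log_prior, grad_log_lik, step, batch_size):
    """One SGLD update for a scalar parameter theta.

    The noisy minibatch gradient makes each step cost O(batch_size)
    rather than O(len(data)).
    """
    n = len(data)
    batch = random.sample(data, batch_size)
    # Unbiased estimate of the gradient of the log posterior.
    grad = grad_log_prior(theta) + (n / batch_size) * sum(
        grad_log_lik(theta, x) for x in batch
    )
    noise = random.gauss(0.0, step ** 0.5)
    return theta + 0.5 * step * grad + noise
```

With the injected noise matched to the step size, the chain approximately targets the posterior without ever evaluating the full-data gradient.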
The detection of differentially expressed (DE) genes is one of the most commonly studied problems in bioinformatics. For example, the identification of DE genes between distinct disease phenotypes is an important first step in understanding and developing …
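A toy sketch of the DE-gene detection task described above, using a per-gene Welch t statistic (which does not assume equal variances across phenotypes) to rank genes by evidence of differential expression; the function names and the dict-based data layout are illustrative assumptions, not any particular method from the literature:

```python
import math

def welch_t(x, y):
    """Welch's t statistic for two samples of expression values."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    v1 = sum((v - m1) ** 2 for v in x) / (n1 - 1)
    v2 = sum((v - m2) ** 2 for v in y) / (n2 - 1)
    return (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)

def rank_genes(expr_a, expr_b):
    """Rank genes by |t| between two phenotypes.

    expr_a, expr_b map each gene name to a list of expression values,
    one per sample in that phenotype.
    """
    scores = {g: abs(welch_t(expr_a[g], expr_b[g])) for g in expr_a}
    return sorted(scores, key=scores.get, reverse=True)
```

In practice, moderated statistics and multiple-testing corrections replace the raw per-gene t statistic, but the ranking structure is the same.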