
Dos and don'ts of reduced chi-squared

Published by: Rene Andrae
Publication date: 2010
Research field: Physics
Language: English





Reduced chi-squared is a very popular method for model assessment, model comparison, convergence diagnostics, and error estimation in astronomy. In this manuscript, we discuss the pitfalls involved in using reduced chi-squared. There are two independent problems: (a) The number of degrees of freedom can only be estimated for linear models. For nonlinear models, the number of degrees of freedom is unknown, i.e., it is not possible to compute the value of reduced chi-squared. (b) Due to random noise in the data, the value of reduced chi-squared is itself subject to noise, i.e., the value is uncertain. This uncertainty impairs the usefulness of reduced chi-squared for differentiating between models or assessing convergence of a minimisation procedure. The impact of noise on the value of reduced chi-squared is surprisingly large, in particular for small data sets, which are very common in astrophysical problems. We conclude that reduced chi-squared can only be used with due caution for linear models, whereas it must not be used for nonlinear models at all. Finally, we recommend more sophisticated and reliable methods, which are also applicable to nonlinear models.
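To make point (b) concrete, here is a minimal simulation sketch, not taken from the manuscript. It assumes a straight-line model (linear in its two parameters), Gaussian noise of known standard deviation, and illustrative values for the sample size, slope, and intercept. Under these assumptions chi-squared follows a chi-squared distribution with K = N - P degrees of freedom, so reduced chi-squared should scatter around 1 with standard deviation sqrt(2/K).

```python
import numpy as np

def reduced_chi2(y, y_model, sigma, n_params):
    """Chi-squared divided by the degrees of freedom N - P (valid for linear models)."""
    resid = (y - y_model) / sigma
    dof = y.size - n_params
    return np.sum(resid**2) / dof

def simulate(n_points=10, n_trials=5000, seed=0):
    """Fit a straight line to many noisy data sets and collect reduced chi-squared."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    sigma = 0.1                                # assumed known noise level
    A = np.vstack([np.ones_like(x), x]).T      # design matrix: intercept and slope
    true_params = np.array([1.0, 2.0])         # illustrative "true" parameters
    values = np.empty(n_trials)
    for i in range(n_trials):
        y = A @ true_params + rng.normal(0.0, sigma, n_points)
        # Equal errors on all points, so ordinary least squares minimises chi-squared.
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        values[i] = reduced_chi2(y, A @ coef, sigma, n_params=2)
    return values

values = simulate()
dof = 10 - 2
print("mean of reduced chi-squared:", values.mean())
print("scatter of reduced chi-squared:", values.std())
print("expected scatter sqrt(2/K):", np.sqrt(2.0 / dof))
```

With only 10 data points and 2 parameters, the expected scatter is sqrt(2/8) = 0.5, so a single fit returning a reduced chi-squared of 1.4 is entirely compatible with a correct model. This is the sense in which the value is uncertain for small data sets.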




Read also

We develop a method to infer log-normal random fields from measurement data affected by Gaussian noise. The log-normal model is well suited to describe strictly positive signals with fluctuations whose amplitude varies over several orders of magnitude. We use the formalism of minimum Gibbs free energy to derive an algorithm that uses the signal's correlation structure to regularize the reconstruction. The correlation structure, described by the signal's power spectrum, is thereby reconstructed from the same data set. We show that the minimization of the Gibbs free energy, corresponding to a Gaussian approximation to the posterior marginalized over the power spectrum, is equivalent to the empirical Bayes ansatz, in which the power spectrum is fixed to its maximum a posteriori value. We further introduce a prior for the power spectrum that enforces spectral smoothness. The appropriateness of this prior in different scenarios is discussed and its effects on the reconstruction results are demonstrated. We validate the performance of our reconstruction algorithm in a series of one- and two-dimensional test cases with varying degrees of non-linearity and different noise levels.
K. Temme, F. Verstraete, 2011
The density matrix in quantum mechanics parameterizes the statistical properties of the system under observation, just like a classical probability distribution does for classical systems. The expectation value of observables cannot be measured directly; it can only be approximated by applying classical statistical methods to the frequencies by which certain measurement outcomes (clicks) are obtained. In this paper, we make a detailed study of the statistical fluctuations obtained during an experiment in which a hypothesis is tested, i.e. the hypothesis that a certain setup produces a given quantum state. Although the classical and quantum problems are very much related to each other, the quantum problem is much richer due to the additional optimization over the measurement basis. Just as in the case of classical hypothesis testing, the confidence in quantum hypothesis testing scales exponentially in the number of copies. In this paper, we will 1) argue that the physically relevant data of quantum experiments is only contained in the frequencies of the measurement outcomes, and that the statistical fluctuations of the experiment are essential, so that the correct formulation of the conclusions of a quantum experiment should be given in terms of hypothesis tests, 2) argue that the (classical) $\chi^2$ test for distinguishing two quantum states gives rise to the quantum $\chi^2$ divergence when optimized over the measurement basis, and 3) present a max-min characterization for the optimal measurement basis for quantum goodness of fit testing, find the quantum measurement which leads both to the maximal Pitman and Bahadur efficiency, and determine the associated divergence rates.
We consider the following general hidden hubs model: an $n \times n$ random matrix $A$ with a subset $S$ of $k$ special rows (hubs): entries in rows outside $S$ are generated from the probability distribution $p_0 \sim N(0,\sigma_0^2)$; for each row in $S$, some $k$ of its entries are generated from $p_1 \sim N(0,\sigma_1^2)$, $\sigma_1 > \sigma_0$, and the rest of the entries from $p_0$. The problem is to identify the high-degree hubs efficiently. This model includes and significantly generalizes the planted Gaussian Submatrix Model, where the special entries are all in a $k \times k$ submatrix. There are two well-known barriers: if $k \geq c\sqrt{n \ln n}$, just the row sums are sufficient to find $S$ in the general model. For the submatrix problem, this can be improved by a $\sqrt{\ln n}$ factor to $k \ge c\sqrt{n}$ by spectral methods or combinatorial methods. In the variant with $p_0 = \pm 1$ (with probability $1/2$ each) and $p_1 \equiv 1$, neither barrier has been broken. We give a polynomial-time algorithm to identify all the hidden hubs with high probability for $k \ge n^{0.5-\delta}$ for some $\delta > 0$, when $\sigma_1^2 > 2\sigma_0^2$. The algorithm extends to the setting where planted entries might have different variances each at least as large as $\sigma_1^2$. We also show a nearly matching lower bound: for $\sigma_1^2 \le 2\sigma_0^2$, there is no polynomial-time Statistical Query algorithm for distinguishing between a matrix whose entries are all from $N(0,\sigma_0^2)$ and a matrix with $k = n^{0.5-\delta}$ hidden hubs for any $\delta > 0$. The lower bound as well as the algorithm are related to whether the chi-squared distance of the two distributions diverges. At the critical value $\sigma_1^2 = 2\sigma_0^2$, we show that the general hidden hubs problem can be solved for $k \geq c\sqrt{n}(\ln n)^{1/4}$, improving on the naive row sum-based method.
We investigate the statistics of stationary points in the sum of squares of $N$ Gaussian random fields, which we call a chi-squared field. The behavior of such a field at a point is investigated, with particular attention paid to the formation of topological defects. An integral to compute the number density of stationary points at a given field amplitude is constructed. We compute exact expressions for the integral in various limits and provide code to evaluate it numerically in the general case. We investigate the dependence of the number density of stationary points on the field amplitude, number of fields, and power spectrum of the individual Gaussian random fields. This work parallels the work of Bardeen, Bond, Kaiser and Szalay, who investigated the statistics of peaks of Gaussian random fields. A number of results for integrating over matrices are presented in appendices.
Astronomical images from optical photometric surveys are typically contaminated with transient artifacts such as cosmic rays, satellite trails and scattered light. We have developed and tested an algorithm that removes these artifacts using a deep, artifact-free, static sky coadd image built up through the median combination of point spread function (PSF) homogenized, overlapping single epoch images. Transient artifacts are detected and masked in each single epoch image through comparison with an artifact-free, PSF-matched simulated image that is constructed using the PSF-corrected, model fitting catalog from the artifact-free coadd image together with the position variable PSF model of the single epoch image. This approach works well not only for cleaning single epoch images with worse seeing than the PSF homogenized coadd, but also for the traditionally much more challenging problem of cleaning single epoch images with better seeing. In addition to masking transient artifacts, we have developed an interpolation approach that uses the local PSF and performs well in removing artifacts whose widths are smaller than the PSF full width at half maximum, including cosmic rays, the peaks of saturated stars and bleed trails. We have tested this algorithm on Dark Energy Survey Science Verification data and present performance metrics. More generally, our algorithm can be applied to any survey which images the same part of the sky multiple times.