Estimating the number of classes

Added by Chang Xuan Mao

Publication date: 2007

Field: Mathematical Statistics

Language: English

Authors: Chang Xuan Mao, Bruce G. Lindsay

Statistics Theory

Estimating the unknown number of classes in a population has numerous important applications. In a Poisson mixture model, the problem is reduced to estimating the odds that a class is undetected in a sample. The discontinuity of the odds prevents the existence of locally unbiased and informative estimators and restricts confidence intervals to be one-sided. Confidence intervals for the number of classes are also necessarily one-sided. A sequence of lower bounds to the odds is developed and used to define pseudo maximum likelihood estimators for the number of classes.
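
As a rough illustration of how a lower bound on the odds turns into a lower bound on the number of classes, the sketch here uses the classical Chao bound, which estimates at least $f_1^2/(2 f_2)$ unseen classes from the singleton count $f_1$ and doubleton count $f_2$. Whether this coincides with a member of the authors' sequence is an assumption, and the function name and input format (a list of per-class counts) are hypothetical.

```python
from collections import Counter


def chao_lower_bound(class_counts):
    """Lower-bound estimate of the number of classes from observed counts.

    class_counts: how many times each observed class appeared.
    Uses the classical Chao bound, not necessarily the paper's estimator.
    """
    observed = [c for c in class_counts if c > 0]
    d = len(observed)                      # number of distinct classes seen
    freq = Counter(observed)
    f1, f2 = freq[1], freq[2]              # singletons and doubletons
    if f2 == 0:
        raise ValueError("bound is undefined without doubletons")
    # Estimated lower bound on the odds that a class is undetected.
    odds_lower = f1 ** 2 / (2 * f2 * d)
    # N = D * (1 + odds), because D estimates N * P(class is detected).
    return d * (1 + odds_lower)


estimate = chao_lower_bound([1, 1, 1, 1, 2, 2, 3, 5])
print(estimate)
```

The result is a one-sided quantity by construction: it bounds the number of classes from below and says nothing about how many more might exist.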

Optimal variance estimation without estimating the mean function

Tiejun Tong, Yanyuan Ma, Yuedong Wang (2013)

We study the least squares estimator in the residual variance estimation context. We show that the mean squared differences of paired observations are asymptotically normally distributed. We further establish that, by regressing the mean squared differences of these paired observations on the squared distances between paired covariates via a simple least squares procedure, the resulting variance estimator is not only asymptotically normal and root-$n$ consistent, but also reaches the optimal bound in terms of estimation variance. We also demonstrate the advantage of the least squares estimator in comparison with existing methods in terms of the second order asymptotic properties.
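
The regression idea can be sketched under simplifying assumptions: an equally spaced design $x_i = i/n$, lag-$k$ paired differences, and an unweighted least squares fit. The intercept of that fit serves as the variance estimate. The function name and the choice of `max_lag` are hypothetical, and the paper's estimator may use weights or a different pairing.

```python
import math
import random


def lag_difference_variance(y, max_lag):
    """Estimate residual variance for an equally spaced design x_i = i/n.

    For each lag k, s_k = sum (y[i+k] - y[i])^2 / (2 (n - k)) has mean
    roughly sigma^2 + J * d_k with d_k = (k/n)^2, so the intercept of an
    ordinary least squares fit of s_k on d_k estimates sigma^2.
    """
    n = len(y)
    if not 2 <= max_lag < n:
        raise ValueError("need 2 <= max_lag < len(y)")
    d, s = [], []
    for k in range(1, max_lag + 1):
        diffs = [(y[i + k] - y[i]) ** 2 for i in range(n - k)]
        s.append(sum(diffs) / (2 * (n - k)))   # mean squared difference
        d.append((k / n) ** 2)                 # squared covariate distance
    d_bar, s_bar = sum(d) / max_lag, sum(s) / max_lag
    slope = (sum((a - d_bar) * (b - s_bar) for a, b in zip(d, s))
             / sum((a - d_bar) ** 2 for a in d))
    return s_bar - slope * d_bar               # intercept = variance estimate


rng = random.Random(0)
n = 1000
# Smooth mean function plus Gaussian noise with true variance 0.25.
y = [math.sin(2 * math.pi * i / n) + rng.gauss(0, 0.5) for i in range(n)]
estimate = lag_difference_variance(y, max_lag=8)
print(estimate)
```

No estimate of the mean function is needed: the slope term absorbs the contribution of the mean function's variation, leaving the intercept for the noise.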

Statistics Theory

Estimating a bivariate linear relationship

David Leonard (2012)

Solutions of the bivariate, linear errors-in-variables estimation problem with unspecified errors are expected to be invariant under interchange and scaling of the coordinates. The appealing model of normally distributed true values and errors is unidentified without additional information. I propose a prior density that incorporates the fact that the slope and variance parameters together determine the covariance matrix of the unobserved true values but is otherwise diffuse. The marginal posterior density of the slope is invariant to interchange and scaling of the coordinates and depends on the data only through the sample correlation coefficient and ratio of standard deviations. It covers the interval between the two ordinary least squares estimates but diminishes rapidly outside of it. I introduce the R package leiv for computing the posterior density, and I apply it to examples in astronomy and method comparison.
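
The interval between the two ordinary least squares estimates can be computed directly, as in this minimal sketch. It returns only the two endpoints (the slope from regressing $y$ on $x$, and the reciprocal of the slope from regressing $x$ on $y$); it does not compute the posterior density and is not the interface of the leiv package. The function name is hypothetical.

```python
def ols_slope_bounds(x, y):
    """Return the two OLS slope estimates, ordered from low to high."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    if sxx == 0 or sxy == 0:
        raise ValueError("slopes are undefined for these data")
    slope_y_on_x = sxy / sxx          # equals r * s_y / s_x
    slope_from_x_on_y = syy / sxy     # equals s_y / (r * s_x)
    return tuple(sorted((slope_y_on_x, slope_from_x_on_y)))


low, high = ols_slope_bounds([0, 1, 2, 3], [0, 2, 1, 3])
print(low, high)
```

Both endpoints are functions of the sample correlation coefficient and the ratio of standard deviations alone, which matches the dependence on the data described in the abstract.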

Statistics Theory

Theoretical analysis of cross-validation for estimating the risk of the k-Nearest Neighbor classifier

Alain Celisse (2015)

The present work aims at deriving theoretical guarantees on the behavior of some cross-validation procedures applied to the $k$-nearest neighbors ($k$NN) rule in the context of binary classification. Here we focus on the leave-$p$-out cross-validation (L$p$O) used to assess the performance of the $k$NN classifier. Remarkably, this L$p$O estimator can be efficiently computed in this context using closed-form formulas derived by \cite{CelisseMaryHuard11}. We describe a general strategy to derive moment and exponential concentration inequalities for the L$p$O estimator applied to the $k$NN classifier. Such results are obtained first by exploiting the connection between the L$p$O estimator and U-statistics, and second by making intensive use of the generalized Efron-Stein inequality applied to the L$1$O estimator. One other important contribution is made by deriving new quantifications of the discrepancy between the L$p$O estimator and the classification error/risk of the $k$NN classifier. The optimality of these bounds is discussed by means of several lower bounds as well as simulation experiments.
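
To make the L$p$O estimator concrete, here is a brute-force sketch that enumerates every way of holding out $p$ points and averages the $k$NN error on them. It is not the closed-form computation mentioned in the abstract and is only practical for small samples. One-dimensional features, 0/1 labels, and both function names are assumptions made for illustration.

```python
from itertools import combinations


def knn_predict(train, point, k):
    """Majority vote among the k nearest (x, label) pairs; labels are 0/1."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - point))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def leave_p_out_error(data, k, p):
    """Average held-out error over every way of removing p points."""
    n = len(data)
    if p < 1 or n - p < k:
        raise ValueError("need p >= 1 and at least k training points")
    total, splits = 0.0, 0
    for held_out in combinations(range(n), p):
        held = set(held_out)
        train = [data[i] for i in range(n) if i not in held]
        errors = sum(knn_predict(train, data[i][0], k) != data[i][1]
                     for i in held_out)
        total += errors / p               # error rate on this split
        splits += 1
    return total / splits                 # average over all splits


separated = [(0, 0), (1, 0), (2, 0), (10, 1), (11, 1), (12, 1)]
print(leave_p_out_error(separated, k=1, p=2))
```

The average over all splits is what gives the estimator its U-statistic structure; the number of splits grows combinatorially in $p$, which is why closed-form formulas matter in practice.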

Statistics Theory

A note on optimal designs for estimating the slope of a polynomial regression

Holger Dette, Viatcheslav B. Melas, Petr Shpilev (2020)

In this note we consider the optimal design problem for estimating the slope of a polynomial regression with no intercept at a given point, say $z$. In contrast to previous work, which considers symmetric design spaces, we investigate the model on the interval $[0, a]$ and characterize those values of $z$ where an explicit solution of the optimal design is possible.

Statistics Theory

Estimating the Rate Constant from Biosensor Data via an Adaptive Variational Bayesian Approach

Y. Zhang, Z. Yao, P. Forssen (2019)

The means to obtain the rate constants of a chemical reaction is a fundamental open problem in both science and industry. Traditional techniques for finding rate constants require either chemical modifications of the reactants or indirect measurements. The rate constant map method is a modern technique to study binding equilibrium and kinetics in chemical reactions. Finding a rate constant map from biosensor data is an ill-posed inverse problem that is usually solved by regularization. In this work, rather than finding a deterministic regularized rate constant map that does not provide uncertainty quantification of the solution, we develop an adaptive variational Bayesian approach to estimate the distribution of the rate constant map, from which some intrinsic properties of a chemical reaction can be explored, including information about rate constants. Our new approach is more realistic than the existing approaches used for biosensors and allows us to estimate the dynamics of the interactions, which are usually hidden in a deterministic approximate solution. We verify the performance of the proposed method by numerical simulations, and compare it with the Markov chain Monte Carlo algorithm. The results illustrate that the variational method can reliably capture the posterior distribution in a computationally efficient way. Finally, the developed method is also tested on real biosensor data (parathyroid hormone), where we provide two novel analysis tools, the thresholding contour map and the high order moment map, to estimate the number of interactions as well as their rate constants.

Statistics Theory
