A data driven trimming procedure for robust classification

Added by Jean-Michel Loubes

Publication date: 2017

Field: Mathematical Statistics

Language: English

Authors: Marina Antolin, Eustasio Del Barrio, Jean-Michel Loubes

Statistics Theory

Classification rules can be severely affected by the presence of disturbing observations in the training sample. Looking for an optimal classifier with such data may lead to unnecessarily complex rules. Simpler, effective classification rules can therefore be achieved if we relax the goal of fitting a good rule to the whole training sample and instead consider only a fraction of the data. In this paper we introduce a new method based on trimming to produce classification rules with guaranteed performance on a significant fraction of the data. In particular, we provide an automatic way of determining the right trimming proportion and obtain, in this setting, oracle bounds for the classification error on the new data set.
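
As a rough illustration of what trimming means here, the sketch below computes a trimmed empirical error: given per-observation losses of some fixed classifier, it discards the fraction `alpha` of observations with the largest loss and averages over the rest. This is not the authors' procedure. The function name `trimmed_error`, the 0/1 losses, and the fixed `alpha` are assumptions made for the example; the paper's data-driven choice of the trimming proportion is not reproduced.

```python
import numpy as np

def trimmed_error(losses, alpha):
    """Average the smallest ceil((1 - alpha) * n) losses.

    losses: per-observation losses of a fixed classifier (hypothetical input).
    alpha:  trimming proportion in [0, 1), fixed by hand in this sketch.
    Returns the trimmed mean loss and the indices of the kept observations.
    """
    losses = np.asarray(losses, dtype=float)
    n = losses.size
    # Number of observations kept after trimming (at least one).
    keep = max(1, int(np.ceil((1.0 - alpha) * n)))
    # Keep the observations the classifier handles best.
    kept_idx = np.argsort(losses)[:keep]
    return losses[kept_idx].mean(), kept_idx
```

With `alpha = 0` this reduces to the ordinary empirical error, so the trimming level controls how much of the sample the rule is allowed to ignore.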

A Scan Procedure for Multiple Testing

Shiyun Chen, Andrew Ying, Ery Arias-Castro (2018)

In a multiple testing framework, we propose a method that identifies the interval with the highest estimated false discovery rate of P-values and rejects the corresponding null hypotheses. Unlike the Benjamini-Hochberg method, which does the same but over intervals with an endpoint at the origin, the new procedure 'scans' all intervals. In parallel with Storey et al. (2004), we show that this scan procedure provides strong control of the asymptotic false discovery rate. In addition, we investigate its asymptotic false non-discovery rate, deriving conditions under which it outperforms the Benjamini-Hochberg procedure. For example, the scan procedure is superior in power-law location models.
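
For reference, the classical Benjamini-Hochberg step-up rule that the abstract compares against can be sketched as follows. It rejects every P-value in an interval starting at the origin, namely all P-values up to the largest sorted one that falls under its threshold k·q/m. The function name `benjamini_hochberg` and the default level `q=0.05` are choices made for this example; the scan procedure itself is not implemented here.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of hypotheses rejected by the BH step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)          # indices that sort the P-values
    sorted_p = p[order]
    # Step-up thresholds k * q / m for k = 1, ..., m.
    thresholds = q * np.arange(1, m + 1) / m
    below = np.nonzero(sorted_p <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size > 0:
        k = below[-1]              # largest rank passing its threshold
        reject[order[:k + 1]] = True   # reject everything up to that rank
    return reject
```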

Statistics Theory

Asymptotic Analysis for Data-Driven Inventory Policies

Xun Zhang, Zhisheng Ye, William B. Haskell (2020)

We study periodic review stochastic inventory control in the data-driven setting, in which the retailer makes ordering decisions based only on historical demand observations without any knowledge of the probability distribution of the demand. Since an $(s, S)$-policy is optimal when the demand distribution is known, we investigate the statistical properties of the data-driven $(s, S)$-policy obtained by recursively computing the empirical cost-to-go functions. This policy is inherently challenging to analyze because the recursion induces propagation of the estimation error backwards in time. In this work, we establish the asymptotic properties of this data-driven policy by fully accounting for the error propagation. First, we rigorously show the consistency of the estimated parameters by filling in some gaps (due to unaccounted error propagation) in the existing studies. On the other hand, empirical process theory cannot be directly applied to show asymptotic normality. To explain, the empirical cost-to-go functions for the estimated parameters are not i.i.d. sums, again due to the error propagation. Our main methodological innovation comes from an asymptotic representation for multi-sample $U$-processes in terms of i.i.d. sums. This representation enables us to apply empirical process theory to derive the influence functions of the estimated parameters and establish joint asymptotic normality. Based on these results, we also propose an entirely data-driven estimator of the optimal expected cost and we derive its asymptotic distribution. We demonstrate some useful applications of our asymptotic results, including sample size determination, as well as interval estimation and hypothesis testing on vital parameters of the inventory problem. The results from our numerical simulations conform to our theoretical analysis.

Statistics Theory

A robust approach for principal component analysis

Maria Camila Vasquez-Correa, Henry Laniado Rodas (2019)

In this paper we analyze different ways of performing principal component analysis through three approaches: robust covariance and correlation matrix estimation, projection pursuit, and a non-parametric maximum entropy algorithm. The objective of these approaches is to correct the well-known sensitivity to outliers of the classical method for principal component analysis. Due to their robustness, they perform very well on contaminated data, while the classical approach fails to preserve the characteristics of the core information.

Statistics Theory

A new test procedure of independence in copula models via chi-square-divergence

Salim Bouzebda (2011)

We introduce a new test procedure of independence in the framework of parametric copulas with unknown marginals. The method is based essentially on the dual representation of the $\chi^2$-divergence on signed finite measures. The asymptotic properties of the proposed estimate and the test statistic are studied under the null and alternative hypotheses, with simple and standard limit distributions whether or not the parameter is an interior point.

Statistics Theory

Doubly robust estimation for conditional treatment effect: a study on asymptotics

Chuyun Ye, Keli Guo, Lixing Zhu (2020)

In this paper, we apply a doubly robust approach to estimate, when some covariates are given, the conditional average treatment effect under parametric, semiparametric and nonparametric structures of the nuisance propensity score and outcome regression models. We then conduct a systematic study of the asymptotic distributions of nine estimators with different combinations of estimated propensity score and outcome regressions. The study covers the asymptotic properties with all models correctly specified; with either the propensity score or the outcome regressions locally or globally misspecified; and with all models locally or globally misspecified. The asymptotic variances are compared and the asymptotic bias correction under model misspecification is discussed. We also explore the phenomenon that the asymptotic variance under model misspecification can sometimes be even smaller than that with all models correctly specified. Finally, we conduct a numerical study to examine the theoretical results.

Statistics Theory
