Revisiting Empirical Bayes Methods and Applications to Special Types of Data

Published by Xiuwen Duan

Publication date: 2021

Research field: Mathematical statistics

Language: English

Author: Xiuwen Duan

Abstract (English)

Empirical Bayes methods have been around for a long time and have a wide range of applications. These methods provide a way to aggregate historical data into estimates of the posterior mean. This thesis revisits some empirical Bayes methods and develops new applications. We first look at a linear empirical Bayes estimator and apply it to ranking and symbolic data. Next, we consider Tweedie's formula and show how it can be applied to analyze a microarray dataset. The application of the formula is simplified with the Pearson system of distributions. Saddlepoint approximations enable us to generalize several results in this direction. The results show that the proposed methods perform well in applications to real data sets.
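As a rough illustration of the Tweedie's formula mentioned in the abstract, the sketch below assumes the textbook normal-means setting, x | mu ~ N(mu, sigma2) with known sigma2, where the posterior mean is x plus sigma2 times the score of the marginal density. The function names (`tweedie_posterior_mean`, `kde_score`, `normal_marginal_score`) and the kernel-density plug-in are assumptions made for illustration; the thesis itself works with the Pearson system and saddlepoint approximations, which are not reproduced here.

```python
import math


def tweedie_posterior_mean(x, score, sigma2):
    """Tweedie's formula for x | mu ~ N(mu, sigma2): E[mu | x] = x + sigma2 * score(x).

    `score` is the derivative of the log marginal density of x.
    """
    return x + sigma2 * score(x)


def kde_score(data, bandwidth):
    """Return the score (d/dx log f_hat) of a Gaussian kernel density estimate."""
    h2 = bandwidth ** 2

    def score(x):
        # Kernel log-weights, shifted by their maximum for numerical stability.
        logs = [-((x - xj) ** 2) / (2.0 * h2) for xj in data]
        top = max(logs)
        weights = [math.exp(value - top) for value in logs]
        total = sum(weights)
        # f_hat'(x) / f_hat(x) = sum_j w_j * (x_j - x) / h^2 with normalized w_j.
        return sum(w * (xj - x) for w, xj in zip(weights, data)) / (total * h2)

    return score


def normal_marginal_score(prior_var, sigma2):
    """Score of the N(0, prior_var + sigma2) marginal implied by a N(0, prior_var) prior."""
    return lambda x: -x / (prior_var + sigma2)
```

With a normal prior the marginal score is linear in x, so the formula collapses to the linear shrinkage rule x * prior_var / (prior_var + sigma2). Passing `kde_score(observations, bandwidth)` instead gives a nonparametric plug-in version, where the bandwidth is a tuning choice left to the user.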

Luella J. Fu, Gareth M. James, Wenguang Sun (2020)

The simultaneous estimation of many parameters $\eta_i$, based on a corresponding set of observations $x_i$, for $i=1,\ldots,n$, is a key research problem that has received renewed attention in the high-dimensional setting. The classic example involves estimating a vector of normal means $\mu_i$ subject to a fixed variance term $\sigma^2$. However, many practical situations involve heterogeneous data $(x_i, \theta_i)$ where $\theta_i$ is a known nuisance parameter. Effectively pooling information across samples while correctly accounting for heterogeneity presents a significant challenge in large-scale estimation problems. We address this issue by introducing the Nonparametric Empirical Bayes Smoothing Tweedie (NEST) estimator, which efficiently estimates $\eta_i$ and properly adjusts for heterogeneity by approximating the marginal density of the data $f_{\theta_i}(x_i)$ and applying this density via a generalized version of Tweedie's formula. NEST is capable of handling a wider range of settings than previously proposed heterogeneous approaches as it does not make any parametric assumptions on the prior distribution of $\eta_i$. The estimation framework is simple but general enough to accommodate any member of the exponential family of distributions; a thorough study of the normal means problem subject to heterogeneous variances is presented to illustrate the proposed framework. Our theoretical results show that NEST is asymptotically optimal, while simulation studies show that it outperforms competing methods, with substantial efficiency gains in many settings. The method is demonstrated on a data set measuring the performance gap in math scores between socioeconomically advantaged and disadvantaged students in K-12 schools.
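For orientation, the normal-means special case with heterogeneous variances can be written as below. This is only a sketch of that one case, with the known standard deviation $\sigma_i$ playing the role of the nuisance parameter $\theta_i$. The symbols $\mu_i$, $\sigma_i$, $G$, and $f_{\sigma}$ are assumptions introduced for illustration and do not reproduce the paper's general exponential-family statement.

```latex
% Assumed model: x_i | mu_i ~ N(mu_i, sigma_i^2) with sigma_i known,
% mu_i ~ G for an unspecified prior G; phi is the standard normal density.
\[
  \mathbb{E}[\mu_i \mid x_i, \sigma_i]
  = x_i + \sigma_i^2 \,
    \frac{\partial}{\partial x} \log f_{\sigma_i}(x) \Big|_{x = x_i},
  \qquad
  f_{\sigma}(x) = \int \frac{1}{\sigma}\,
    \phi\!\left(\frac{x - \mu}{\sigma}\right) dG(\mu).
\]
```

Because the marginal density $f_{\sigma}$ changes with $\sigma$, a single density estimate pooled over all observations is not enough in this setting; the score has to be estimated as a function of both $x$ and $\sigma$.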

A novel sandwich algorithm for empirical Bayes analysis of rank data

Arnab Kumar Laha, Somak Dutta, Vivekananda Roy (2017)

Rank data arises frequently in marketing, finance, organizational behavior, and psychology. Most analysis of rank data reported in the literature assumes the presence of one or more variables (sometimes latent) based on whose values the items are ranked. In this paper we analyze rank data using a purely probabilistic model where the observed ranks are assumed to be perturbe

Nonparametric empirical Bayes and maximum likelihood estimation for high-dimensional data analysis

Lee H. Dicker, Sihai D. Zhao (2014)

Nonparametric empirical Bayes methods provide a flexible and attractive approach to high-dimensional data analysis. One particularly elegant empirical Bayes methodology, involving the Kiefer-Wolfowitz nonparametric maximum likelihood estimator (NPMLE) for mixture models, has been known for decades. However, implementation and theoretical analysis of the Kiefer-Wolfowitz NPMLE are notoriously difficult. A fast algorithm was recently proposed that makes NPMLE-based procedures feasible for use in large-scale problems, but the algorithm calculates only an approximation to the NPMLE. In this paper we make two contributions. First, we provide upper bounds on the convergence rate of the approximate NPMLE's statistical error, which have the same order as the best known bounds for the true NPMLE. This suggests that the approximate NPMLE is just as effective as the true NPMLE for statistical applications. Second, we illustrate the promise of NPMLE procedures in a high-dimensional binary classification problem. We propose a new procedure and show that it vastly outperforms existing methods in experiments with simulated data. In real data analyses involving cancer survival and gene expression data, we show that it is very competitive with several recently proposed methods for regularized linear discriminant analysis, another popular approach to high-dimensional classification.
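To make the idea of an approximate NPMLE concrete, one common simplification restricts the mixing distribution to a fixed grid of support points and fits only the weights. The sketch below does this with plain EM for a unit-variance normal mixture. It is an illustrative stand-in, not the fast algorithm referred to in the abstract, and the names `grid_npmle`, `log_likelihood`, and `posterior_mean` are hypothetical.

```python
import math


def normal_pdf(x, mean, sd=1.0):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def grid_npmle(data, grid, n_iter=200):
    """EM updates for mixing weights on fixed support points (assumed N(mu, 1) noise).

    Assumes every observation lies close enough to the grid that its
    mixture density does not underflow to zero.
    """
    k = len(grid)
    weights = [1.0 / k] * k
    lik = [[normal_pdf(x, m) for m in grid] for x in data]
    for _ in range(n_iter):
        updated = [0.0] * k
        for row in lik:
            mix = sum(w * l for w, l in zip(weights, row))
            for j in range(k):
                # Posterior probability that this observation came from grid[j].
                updated[j] += weights[j] * row[j] / mix
        weights = [value / len(data) for value in updated]
    return weights


def log_likelihood(data, grid, weights):
    """Mixture log-likelihood of the data under the given grid weights."""
    return sum(
        math.log(sum(w * normal_pdf(x, m) for w, m in zip(weights, grid)))
        for x in data
    )


def posterior_mean(x, grid, weights):
    """Empirical Bayes estimate of mu given x under the fitted discrete prior."""
    post = [w * normal_pdf(x, m) for w, m in zip(weights, grid)]
    return sum(p * m for p, m in zip(post, grid)) / sum(post)
```

A finer grid brings the fitted prior closer to the unrestricted estimator at a higher computational cost. The weight problem is convex, so a convex solver could replace EM.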

Empirical Bayes approaches to PageRank type algorithms for rating scientific journals

Jean-Louis Foulley, Gilles Celeux, Julie Josse (2017)

Following criticisms against the journal Impact Factor, new journal influence scores have been developed such as the Eigenfactor or the Prestige Scimago Journal Rank. They are based on PageRank type algorithms on the cross-citations transition matrix of the citing-cited network. The PageRank algorithm performs a smoothing of the transition matrix combining a random walk on the data network and a teleportation to all possible nodes with fixed probabilities (the damping factor being $\alpha = 0.85$). We reinterpret this smoothing matrix as the mean of a posterior distribution of a Dirichlet-multinomial model in an empirical Bayes perspective. We suggest a simple yet efficient way to make a clear distinction between structural and sampling zeroes. This allows us to contrast cases with self-citations included or excluded to avoid overvalued journal bias. We estimate the model parameters by maximizing the marginal likelihood with a Majorize-Minimize algorithm. The procedure ends up with a score similar to the PageRank ones but with a damping factor depending on each journal. The procedures are illustrated with an example about cross-citations among 47 statistical journals studied by Varin et al. (2016).
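The link between PageRank smoothing and a Dirichlet-multinomial posterior mean can be sketched in a few lines. The code below assumes a uniform teleportation vector and a single concentration parameter shared by all rows. It ignores structural zeroes and the marginal-likelihood fitting step described in the abstract, and the function names are hypothetical.

```python
def row_normalize(counts):
    """Turn a matrix of citation counts into a row-stochastic transition matrix.

    Assumes every row has a positive total.
    """
    return [[c / sum(row) for c in row] for row in counts]


def pagerank_smoothing(counts, alpha=0.85):
    """Classic smoothing: alpha * P + (1 - alpha) * uniform teleportation."""
    n = len(counts)
    transition = row_normalize(counts)
    return [[alpha * p + (1.0 - alpha) / n for p in row] for row in transition]


def dirichlet_posterior_mean(counts, prior_mean, concentration):
    """Posterior mean of each row under a Dirichlet(concentration * prior_mean) prior."""
    smoothed = []
    for row in counts:
        total = sum(row)
        smoothed.append(
            [(c + concentration * m) / (total + concentration)
             for c, m in zip(row, prior_mean)]
        )
    return smoothed


def implied_damping(counts, concentration):
    """Row-specific damping factor: alpha_i = n_i / (n_i + concentration)."""
    return [sum(row) / (sum(row) + concentration) for row in counts]
```

Under these assumptions each posterior-mean row equals `alpha_i * p_ij + (1 - alpha_i) * prior_mean_j`, so rows with more observed citations are shrunk less toward the teleportation vector. This is one way to read the journal-dependent damping factor.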

Robust and Efficient Empirical Bayes Confidence Intervals using $\gamma$-Divergence

Daisuke Kurisu, Takuya Ishihara, Shonosuke Sugasawa (2021)

Although parametric empirical Bayes confidence intervals of multiple normal means are fundamental tools for compound decision problems, their performance can be sensitive to misspecification of the parametric prior distribution (typically a normal distribution), especially when some strong signals are included. We suggest a simple modification of the standard confidence intervals such that the proposed interval is robust against misspecification of the prior distribution. Our main idea is to use the well-known Tweedie's formula with a robust likelihood based on $\gamma$-divergence. An advantage of the new interval is that the interval lengths are always smaller than or equal to those of the parametric empirical Bayes confidence interval, so the new interval is both efficient and robust. We prove asymptotic validity: the coverage probability of the proposed confidence intervals attains the nominal level even when the true underlying distribution of signals is contaminated, and the coverage accuracy is less sensitive to the contamination ratio. The numerical performance of the proposed method is demonstrated through simulation experiments and a real data application.
