Naive Bayes classifiers have proven to be useful in many prediction problems with complete training data. Here we consider the situation where a naive Bayes classifier is trained with data where the response is right censored. Such prediction problems are for instance encountered in profiling systems used at National Employment Agencies. In this paper we propose the maximum collective conditional likelihood estimator for the prediction and show that it is strongly consistent under the usual identifiability condition.
We propose a censored quantile regression estimator motivated by unbiased estimating equations. Under the usual conditional independence assumption of the survival time and the censoring time given the covariates, we show that the proposed estimator is consistent and asymptotically normal. We develop an efficient computational algorithm which uses existing quantile regression code. As a result, bootstrap-type inference can be efficiently implemented. We illustrate the finite-sample performance of the proposed method by simulation studies and analysis of a survival data set.
Bayes classifiers for functional data pose a challenge. This is because probability density functions do not exist for functional data. As a consequence, the classical Bayes classifier using density quotients needs to be modified. We propose to use density ratios of projections on a sequence of eigenfunctions that are common to the groups to be classified. The density ratios can then be factored into density ratios of individual functional principal components whence the classification problem is reduced to a sequence of nonparametric one-dimensional density estimates. This is an extension to functional data of some of the very earliest nonparametric Bayes classifiers that were based on simple density ratios in the one-dimensional case. By means of the factorization of the density quotients the curse of dimensionality that would otherwise severely affect Bayes classifiers for functional data can be avoided. We demonstrate that in the case of Gaussian functional data, the proposed functional Bayes classifier reduces to a functional version of the classical quadratic discriminant. A study of the asymptotic behavior of the proposed classifiers in the large sample limit shows that under certain conditions the misclassification rate converges to zero, a phenomenon that has been referred to as perfect classification. The proposed classifiers also perform favorably in finite sample applications, as we demonstrate in comparisons with other functional classifiers in simulations and various data applications, including wine spectral data, functional magnetic resonance imaging (fMRI) data for attention deficit hyperactivity disorder (ADHD) patients, and yeast gene expression data.
In the class of normal regression models with a finite number of regressors, and for a wide class of prior distributions, a Bayesian model selection procedure based on the Bayes factor is consistent [Casella and Moreno J. Amer. Statist. Assoc. 104 (2009) 1261--1271]. However, in models where the number of parameters increases as the sample size increases, properties of the Bayes factor are not totally understood. Here we study consistency of the Bayes factors for nested normal linear models when the number of regressors increases with the sample size. We pay attention to two successful tools for model selection [Schwarz Ann. Statist. 6 (1978) 461--464] approximation to the Bayes factor, and the Bayes factor for intrinsic priors [Berger and Pericchi J. Amer. Statist. Assoc. 91 (1996) 109--122, Moreno, Bertolino and Racugno J. Amer. Statist. Assoc. 93 (1998) 1451--1460]. We find that the the Schwarz approximation and the Bayes factor for intrinsic priors are consistent when the rate of growth of the dimension of the bigger model is $O(n^b)$ for $b<1$. When $b=1$ the Schwarz approximation is always inconsistent under the alternative while the Bayes factor for intrinsic priors is consistent except for a small set of alternative models which is characterized.
In this paper we investigate the estimation of the unknown parameters of a competing risk model based on a Weibull distributed decreasing failure rate and an exponentially distributed constant failure rate, under right censored data.likelihood estimators.
Two-sample tests have been one of the most classical topics in statistics with wide application even in cutting edge applications. There are at least two modes of inference used to justify the two-sample tests. One is usual superpopulation inference assuming the units are independent and identically distributed (i.i.d.) samples from some superpopulation; the other is finite population inference that relies on the random assignments of units into different groups. When randomization is actually implemented, the latter has the advantage of avoiding distributional assumptions on the outcomes. In this paper, we will focus on finite population inference for censored outcomes, which has been less explored in the literature. Moreover, we allow the censoring time to depend on treatment assignment, under which exact permutation inference is unachievable. We find that, surprisingly, the usual logrank test can also be justified by randomization. Specifically, under a Bernoulli randomized experiment with non-informative i.i.d. censoring within each treatment arm, the logrank test is asymptotically valid for testing Fishers null hypothesis of no treatment effect on any unit. Moreover, the asymptotic validity of the logrank test does not require any distributional assumption on the potential event times. We further extend the theory to the stratified logrank test, which is useful for randomized blocked designs and when censoring mechanisms vary across strata. In sum, the developed theory for the logrank test from finite population inference supplements its classical theory from usual superpopulation inference, and helps provide a broader justification for the logrank test.