Weighted log-rank tests are arguably the most widely used tests by practitioners for the two-sample problem in the context of right-censored data. Many approaches have been considered to make weighted log-rank tests more robust against a broader family of alternatives, among them, considering linear combinations of weighted log-rank tests, and taking the maximum among a finite collection of them. In this paper, we propose as test statistic the supremum of a collection of (potentially infinite) weight-indexed log-rank tests where the index space is the unit ball in a reproducing kernel Hilbert space (RKHS). By using some desirable properties of RKHSs we provide an exact and simple evaluation of the test statistic and establish connections with previous tests in the literature. Additionally, we show that for a special family of RKHSs, the proposed test is omnibus. We finalise by performing an empirical evaluation of the proposed methodology and show an application to a real data scenario. Our theoretical results are proved using techniques for double integrals with respect to martingales that may be of independent interest.
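The "exact and simple evaluation" mentioned above is the kind of result that typically follows from a standard RKHS identity. The derivation below is a sketch under assumed notation that is not taken from the paper: $\nu_n$ denotes a finite signed measure such that the log-rank statistic with weight $\omega$ can be written as $\int \omega \, d\nu_n$, $\mathcal{H}$ is the RKHS, and $K$ is its reproducing kernel, assumed bounded.

```latex
\begin{align*}
\sup_{\|\omega\|_{\mathcal{H}} \le 1} \left( \int \omega(t) \, d\nu_n(t) \right)^2
  &= \sup_{\|\omega\|_{\mathcal{H}} \le 1}
     \left\langle \omega, \int K(t, \cdot) \, d\nu_n(t) \right\rangle_{\mathcal{H}}^2
     && \text{(reproducing property)} \\
  &= \left\| \int K(t, \cdot) \, d\nu_n(t) \right\|_{\mathcal{H}}^2
     && \text{(Cauchy--Schwarz, with equality at the normalised representer)} \\
  &= \iint K(s, t) \, d\nu_n(s) \, d\nu_n(t)
     && \text{(reproducing property again)}.
\end{align*}
```

When $\nu_n$ is supported on the observed event times, the double integral reduces to a finite double sum of kernel evaluations, so no optimisation over the unit ball is needed.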
We introduce a general non-parametric independence test between right-censored survival times and covariates, which may be multivariate. Our test statistic has a dual interpretation, first in terms of the supremum of a potentially infinite collection of weight-indexed log-rank tests, with weight functions belonging to a reproducing kernel Hilbert space (RKHS) of functions; and second, as the norm of the difference of embeddings of certain finite measures into the RKHS, similar to the Hilbert-Schmidt Independence Criterion (HSIC) test-statistic. We study the asymptotic properties of the test, finding sufficient conditions to ensure our test correctly rejects the null hypothesis under any alternative. The test statistic can be computed straightforwardly, and the rejection threshold is obtained via an asymptotically consistent Wild Bootstrap procedure. Extensive simulations demonstrate that our testing procedure generally performs better than competing approaches in detecting complex non-linear dependence.
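For orientation, the classical HSIC statistic that the abstract compares against can be sketched as follows. This is the standard biased (V-statistic) estimator for fully observed data and is not the censored-data statistic proposed in the paper; the Gaussian kernel, the bandwidth defaults, and the function names `gaussian_gram` and `hsic_biased` are assumptions made for illustration.

```python
import numpy as np


def gaussian_gram(z, bandwidth=1.0):
    """Gaussian-kernel Gram matrix for the rows of z (1-D input is treated as a column)."""
    z = np.asarray(z, dtype=float).reshape(len(z), -1)
    sq_norms = np.sum(z**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * z @ z.T
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def hsic_biased(x, y, bandwidth_x=1.0, bandwidth_y=1.0):
    """Biased HSIC estimate: trace(K H L H) / n^2, with H the centring matrix."""
    n = len(x)
    k = gaussian_gram(x, bandwidth_x)
    l = gaussian_gram(y, bandwidth_y)
    h = np.eye(n) - np.ones((n, n)) / n
    return np.trace(k @ h @ l @ h) / n**2
```

The estimate equals the squared RKHS norm of the difference between the empirical joint embedding and the product of the empirical marginal embeddings, which is the "norm of the difference of embeddings" interpretation in its uncensored form.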
This paper develops a frequentist solution to the functional calibration problem, where the value of a calibration parameter in a computer model is allowed to vary with the value of control variables in the physical system. The need for functional calibration is motivated by engineering applications where using a constant calibration parameter results in a significant mismatch between outputs from the computer model and the physical experiment. Reproducing kernel Hilbert spaces (RKHS) are used to model the optimal calibration function, defined as the functional relationship between the calibration parameter and control variables that gives the best prediction. This optimal calibration function is estimated from physical data through penalized least squares with an RKHS-norm penalty. An uncertainty quantification procedure is also developed for such estimates. Theoretical guarantees of the proposed method are provided in terms of prediction consistency and consistency of estimating the optimal calibration function. The proposed method is tested using both real and synthetic data and exhibits more robust performance in prediction and uncertainty quantification than the existing parametric functional calibration method and a state-of-the-art Bayesian method.
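Penalized least squares with an RKHS-norm penalty has a closed-form solution in its simplest setting, plain kernel ridge regression, which the sketch below illustrates. It is not the paper's calibration estimator, which involves a computer model between the calibration function and the output. The Gaussian kernel, the objective scaling $\frac{1}{n}\sum_i (y_i - f(x_i))^2 + \lambda \|f\|_{\mathcal{H}}^2$, and all function names are assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian-kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def fit_kernel_ridge(x, y, lam=1e-2, bandwidth=1.0):
    """Return coefficients alpha of f = sum_i alpha_i k(x_i, .).

    By the representer theorem the minimiser of
    (1/n) * sum (y_i - f(x_i))^2 + lam * ||f||^2 satisfies
    (K + n * lam * I) alpha = y.
    """
    n = x.shape[0]
    gram = gaussian_kernel(x, x, bandwidth)
    return np.linalg.solve(gram + n * lam * np.eye(n), y)


def predict_kernel_ridge(x_train, alpha, x_new, bandwidth=1.0):
    """Evaluate the fitted function at the rows of x_new."""
    return gaussian_kernel(x_new, x_train, bandwidth) @ alpha
```

Larger values of `lam` shrink the fit towards zero and give smoother estimates; in practice the penalty level and bandwidth would be tuned, for example by cross-validation.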
We consider settings in which the data of interest correspond to pairs of ordered times, e.g., the birth times of the first and second child, the times at which a new user creates an account and makes the first purchase on a website, and the entry and survival times of patients in a clinical trial. In these settings, the two times are not independent (the second occurs after the first), yet it is still of interest to determine whether there exists significant dependence beyond their ordering in time. We refer to this notion as quasi-(in)dependence. For instance, in a clinical trial, to avoid biased selection, we might wish to verify that recruitment times are quasi-independent of survival times, where dependencies might arise due to seasonal effects. In this paper, we propose a nonparametric statistical test of quasi-independence. Our test considers a potentially infinite space of alternatives, making it suitable for complex data where the nature of the possible quasi-dependence is not known in advance. Standard parametric approaches, such as the classical conditional Kendall's tau and log-rank tests, are recovered as special cases. The tests apply in the right-censored setting, an essential feature in clinical trials, where patients can withdraw from the study. We provide an asymptotic analysis of our test statistic, and demonstrate in experiments that our test obtains better power than existing approaches, while being more computationally efficient.
The goal of nonparametric regression is to recover an underlying regression function from noisy observations, under the assumption that the regression function belongs to a pre-specified infinite dimensional function space. In the online setting, when the observations come in a stream, it is generally computationally infeasible to refit the whole model repeatedly. There are as of yet no methods that are both computationally efficient and statistically rate-optimal. In this paper, we propose an estimator for online nonparametric regression. Notably, our estimator is an empirical risk minimizer (ERM) in a deterministic linear space, which is quite different from existing methods using random features and functional stochastic gradient. Our theoretical analysis shows that this estimator obtains rate-optimal generalization error when the regression function is known to live in a reproducing kernel Hilbert space. We also show, theoretically and empirically, that the computational expense of our estimator is much lower than other rate-optimal estimators proposed for this online setting.
The geometry of spaces with indefinite inner product, also known as Krein spaces, is a basic tool for developing operator theory therein. In the present paper we establish a link between this geometry and the algebraic theory of *-semigroups. The link goes through positive definite functions and their associated reproducing kernel Hilbert spaces. Our concern is to describe the properties of semigroup elements that determine shift operators serving as Pontryagin fundamental symmetries.