No Arabic abstract
Online forecasting under a changing environment has been a problem of increasing importance in many real-world applications. In this paper, we consider the meta-algorithm presented in citet{zhang2017dynamic} combined with different subroutines. We show that an expected cumulative error of order $tilde{O}(n^{1/3} C_n^{2/3})$ can be obtained for non-stationary online linear regression where the total variation of parameter sequence is bounded by $C_n$. Our paper extends the result of online forecasting of one-dimensional time-series as proposed in cite{baby2019online} to general $d$-dimensional non-stationary linear regression. We improve the rate $O(sqrt{n C_n})$ obtained by Zhang et al. 2017 and Besbes et al. 2015. We further extend our analysis to non-stationary online kernel regression. Similar to the non-stationary online regression case, we use the meta-procedure of Zhang et al. 2017 combined with Kernel-AWV (Jezequel et al. 2020) to achieve an expected cumulative controlled by the effective dimension of the RKHS and the total variation of the sequence. To the best of our knowledge, this work is the first extension of non-stationary online regression to non-stationary kernel regression. Lastly, we evaluate our method empirically with several existing benchmarks and also compare it with the theoretical bound obtained in this paper.
We consider the setting of online logistic regression and consider the regret with respect to the 2-ball of radius B. It is known (see [Hazan et al., 2014]) that any proper algorithm which has logarithmic regret in the number of samples (denoted n) necessarily suffers an exponential multiplicative constant in B. In this work, we design an efficient improper algorithm that avoids this exponential constant while preserving a logarithmic regret. Indeed, [Foster et al., 2018] showed that the lower bound does not apply to improper algorithms and proposed a strategy based on exponential weights with prohibitive computational complexity. Our new algorithm based on regularized empirical risk minimization with surrogate losses satisfies a regret scaling as O(B log(Bn)) with a per-round time-complexity of order O(d^2).
It has been recently shown in the literature that the sample averages from online learning experiments are biased when used to estimate the mean reward. To correct the bias, off-policy evaluation methods, including importance sampling and doubly robust estimators, typically calculate the propensity score, which is unavailable in this setting due to unknown reward distribution and the adaptive policy. This paper provides a procedure to debias the samples using bootstrap, which doesnt require the knowledge of the reward distribution at all. Numerical experiments demonstrate the effective bias reduction for samples generated by popular multi-armed bandit algorithms such as Explore-Then-Commit (ETC), UCB, Thompson sampling and $epsilon$-greedy. We also analyze and provide theoretical justifications for the procedure under the ETC algorithm, including the asymptotic convergence of the bias decay rate in the real and bootstrap worlds.
Distributed computing offers a high degree of flexibility to accommodate modern learning constraints and the ever increasing size of datasets involved in massive data issues. Drawing inspiration from the theory of distributed computation models developed in the context of gradient-type optimization algorithms, we present a consensus-based asynchronous distributed approach for nonparametric online regression and analyze some of its asymptotic properties. Substantial numerical evidence involving up to 28 parallel processors is provided on synthetic datasets to assess the excellent performance of our method, both in terms of computation time and prediction accuracy.
We consider the online version of the isotonic regression problem. Given a set of linearly ordered points (e.g., on the real line), the learner must predict labels sequentially at adversarially chosen positions and is evaluated by her total squared loss compared against the best isotonic (non-decreasing) function in hindsight. We survey several standard online learning algorithms and show that none of them achieve the optimal regret exponent; in fact, most of them (including Online Gradient Descent, Follow the Leader and Exponential Weights) incur linear regret. We then prove that the Exponential Weights algorithm played over a covering net of isotonic functions has a regret bounded by $Obig(T^{1/3} log^{2/3}(T)big)$ and present a matching $Omega(T^{1/3})$ lower bound on regret. We provide a computationally efficient version of this algorithm. We also analyze the noise-free case, in which the revealed labels are isotonic, and show that the bound can be improved to $O(log T)$ or even to $O(1)$ (when the labels are revealed in isotonic order). Finally, we extend the analysis beyond squared loss and give bounds for entropic loss and absolute loss.
Single index models provide an effective dimension reduction tool in regression, especially for high dimensional data, by projecting a general multivariate predictor onto a direction vector. We propose a novel single-index model for regression models where metric space-valued random object responses are coupled with multivariate Euclidean predictors. The responses in this regression model include complex, non-Euclidean data, including covariance matrices, graph Laplacians of networks, and univariate probability distribution functions among other complex objects that lie in abstract metric spaces. Frechet regression has provided an approach for modeling the conditional mean of such random objects given multivariate Euclidean vectors, but it does not provide for regression parameters such as slopes or intercepts, since the metric space-valued responses are not amenable to linear operations. We show here that for the case of multivariate Euclidean predictors, the parameters that define a single index and associated projection vector can be used to substitute for the inherent absence of parameters in Frechet regression. Specifically, we derive the asymptotic consistency of suitable estimates of these parameters subject to an identifiability condition. Consistent estimation of the link function of the single index Frechet regression model is obtained through local Frechet regression. We demonstrate the finite sample performance of estimation for the proposed single index Frechet regression model through simulation studies, including the special cases of probability distributions and graph adjacency matrices. The method is also illustrated for resting-state functional Magnetic Resonance Imaging (fMRI) data from the ADNI study.