Distributed computing offers a high degree of flexibility to accommodate modern learning constraints and the ever increasing size of datasets involved in massive data issues. Drawing inspiration from the theory of distributed computation models devel
oped in the context of gradient-type optimization algorithms, we present a consensus-based asynchronous distributed approach for nonparametric online regression and analyze some of its asymptotic properties. Substantial numerical evidence involving up to 28 parallel processors is provided on synthetic datasets to assess the excellent performance of our method, both in terms of computation time and prediction accuracy.
Let $bX=(X_1, hdots, X_d)$ be a $mathbb R^d$-valued random vector with i.i.d. components, and let $VertbXVert_p= (sum_{j=1}^d|X_j|^p)^{1/p}$ be its $p$-norm, for $p>0$. The impact of letting $d$ go to infinity on $VertbXVert_p$ has surprising consequ
ences, which may dramatically affect high-dimensional data processing. This effect is usually referred to as the {it distance concentration phenomenon} in the computational learning literature. Despite a growing interest in this important question, previous work has essentially characterized the problem in terms of numerical experiments and incomplete mathematical statements. In the present paper, we solidify some of the arguments which previously appeared in the literature and offer new insights into the phenomenon.