Nakamoto consensus underlies the security of many of the world's largest cryptocurrencies, such as Bitcoin and Ethereum. Common lore is that Nakamoto consensus only achieves consistency and liveness under a regime where the difficulty of its underlying mining puzzle is very high, negatively impacting overall throughput and latency. In this work, we study Nakamoto consensus under a wide range of puzzle difficulties, including very easy puzzles. We first analyze an adversary-free setting and show that, surprisingly, the common prefix of the blockchain grows quickly even with easy puzzles. In a setting with adversaries, we provide a small backwards-compatible change to Nakamoto consensus to achieve consistency and liveness with easy puzzles. Our insight relies on a careful choice of \emph{symmetry-breaking strategy}, which was significantly underestimated in prior work. We introduce a new method, \emph{coalescing random walks}, for analyzing the correctness of Nakamoto consensus under the uniformly-at-random symmetry-breaking strategy. This method is more powerful than existing analysis methods that focus on bounding the number of \emph{convergence opportunities}.
Federated Learning (FL) is a promising framework with great potential for preserving privacy and for lowering the computation load at the cloud. FedAvg and FedProx are two widely adopted algorithms. However, recent work raised concerns about these two methods: (1) their fixed points do not correspond to the stationary points of the original optimization problem, and (2) the common model found might not generalize well locally. In this paper, we alleviate these concerns. To this end, we adopt the statistical learning perspective yet allow the distributions to be heterogeneous and the local data to be unbalanced. We show, in the general kernel regression setting, that both FedAvg and FedProx converge to the minimax-optimal error rates. Moreover, when the kernel function has a finite rank, the convergence is exponentially fast. Our results further analytically quantify the impact of the model heterogeneity and characterize the federation gain, that is, the reduction of the estimation error for a worker that joins federated learning compared to the best local estimator. To the best of our knowledge, we are the first to show the achievability of minimax error rates under FedAvg and FedProx, and the first to characterize the gains in joining FL. Numerical experiments further corroborate our theoretical findings on the statistical optimality of FedAvg and FedProx and the federation gains.
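As a rough illustration of the FedAvg update discussed above, the following is a minimal sketch rather than the setting analyzed in the paper. It assumes a scalar linear model with squared loss instead of kernel regression, full worker participation in every round, and a server average weighted by local sample size. The names `local_gd`, `fedavg`, and `make_worker`, and all hyperparameter values, are hypothetical choices made for this example.

```python
import random


def local_gd(w, data, lr, steps):
    """Run `steps` gradient-descent steps on one worker's mean squared loss."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fedavg(workers, rounds=50, lr=0.05, local_steps=5):
    """FedAvg sketch: broadcast the model, update locally, then average.

    The average is weighted by local sample size (an assumption of this
    sketch), so unbalanced workers contribute proportionally.
    """
    w = 0.0
    total = sum(len(data) for data in workers)
    for _ in range(rounds):
        local_models = [local_gd(w, data, lr, local_steps) for data in workers]
        w = sum(len(data) * w_i for data, w_i in zip(workers, local_models)) / total
    return w


def make_worker(rng, n, slope, noise=0.1):
    """Generate n noisy samples of y = slope * x (synthetic, illustrative data)."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, slope * x + rng.gauss(0, noise)) for x in xs]


rng = random.Random(0)
# Unbalanced local data sizes; here all workers share the same true slope.
workers = [make_worker(rng, n, slope=2.0) for n in (5, 20, 100)]
w_hat = fedavg(workers)
```

Giving each worker a different `slope` is one simple way to experiment with heterogeneous local distributions; the common model `w_hat` then need not coincide with any single worker's best local fit.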
Lili Su, Pengkun Yang (2019)
We consider training over-parameterized two-layer neural networks with Rectified Linear Unit (ReLU) using gradient descent (GD) method. Inspired by a recent line of work, we study the evolutions of network prediction errors across GD iterations, which can be neatly described in a matrix form. When the network is sufficiently over-parameterized, these matrices individually approximate {\em an} integral operator which is determined by the feature vector distribution $\rho$ only. Consequently, GD method can be viewed as {\em approximately} applying the powers of this integral operator on the underlying/target function $f^*$ that generates the responses/labels. We show that if $f^*$ admits a low-rank approximation with respect to the eigenspaces of this integral operator, then the empirical risk decreases to this low-rank approximation error at a linear rate which is determined by $f^*$ and $\rho$ only, i.e., the rate is independent of the sample size $n$. Furthermore, if $f^*$ has zero low-rank approximation error, then, as long as the width of the neural network is $\Omega(n\log n)$, the empirical risk decreases to $\Theta(1/\sqrt{n})$. To the best of our knowledge, this is the first result showing the sufficiency of nearly-linear network over-parameterization. We provide an application of our general results to the setting where $\rho$ is the uniform distribution on the spheres and $f^*$ is a polynomial. Throughout this paper, we consider the scenario where the input dimension $d$ is fixed.
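The training setup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's exact construction: hidden weights are drawn from a standard Gaussian, output weights are fixed random signs scaled by $1/\sqrt{m}$, only the hidden layer is trained, and the empirical risk is half the mean squared error. The function names, the width `m`, the step size, and the toy data (points on the unit circle with the degree-one polynomial target $f^*(x) = x_1$) are all hypothetical choices for this example.

```python
import math
import random


def relu_net(W, a, x):
    """f(x) = (1/sqrt(m)) * sum_j a_j * relu(<w_j, x>)."""
    m = len(W)
    total = 0.0
    for w, a_j in zip(W, a):
        pre = sum(w_i * x_i for w_i, x_i in zip(w, x))
        total += a_j * max(0.0, pre)
    return total / math.sqrt(m)


def train(X, y, m=100, lr=0.2, steps=100, seed=0):
    """Full-batch GD on the hidden layer; returns the risk before each step."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]
    a = [rng.choice((-1.0, 1.0)) for _ in range(m)]  # fixed output weights
    risks = []
    for _ in range(steps):
        # Prediction errors on the training set at the current weights.
        errs = [relu_net(W, a, x) - y_i for x, y_i in zip(X, y)]
        risks.append(sum(e * e for e in errs) / (2 * n))
        for j in range(m):
            w = W[j]
            grad = [0.0] * d
            for x, e in zip(X, errs):
                # ReLU derivative: the neuron contributes only when active.
                if sum(w_i * x_i for w_i, x_i in zip(w, x)) > 0:
                    for t in range(d):
                        grad[t] += e * a[j] * x[t] / (math.sqrt(m) * n)
            W[j] = [w_i - lr * g for w_i, g in zip(w, grad)]
    return risks


data_rng = random.Random(1)
angles = [data_rng.uniform(0, 2 * math.pi) for _ in range(10)]
X = [[math.cos(t), math.sin(t)] for t in angles]  # inputs on the unit circle
y = [x[0] for x in X]                             # target f*(x) = x_1
risks = train(X, y)
```

Plotting `risks` against the iteration index gives a quick view of how the empirical risk evolves across GD iterations for a given width.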
Winner-Take-All (WTA) refers to the neural operation that selects a (typically small) group of neurons from a large neuron pool. It is conjectured to underlie many of the brain's fundamental computational abilities. However, not much is known about the robustness of a spike-based WTA network to the inherent randomness of the input spike trains. In this work, we consider a spike-based $k$-WTA model wherein $n$ randomly generated input spike trains compete with each other based on their underlying statistics, and $k$ winners are supposed to be selected. We slot the time evenly with each time slot of length $1\,\mathrm{ms}$, and model the $n$ input spike trains as $n$ independent Bernoulli processes. The Bernoulli process is a good approximation of the popular Poisson process but is more biologically relevant as it takes the refractory periods into account. Due to the randomness in the input spike trains, no circuit can guarantee selecting the correct winners in finite time. We focus on analytically characterizing the minimal amount of time needed so that a target minimax decision accuracy (success probability) can be reached. We first derive an information-theoretic lower bound on the decision time. We show that to have a (minimax) decision error $\le \delta$ (where $\delta \in (0,1)$), the computation time of any WTA circuit is at least \[ \big((1-\delta) \log(k(n-k)+1) - 1\big)\, T_{\mathcal{R}}, \] where $T_{\mathcal{R}}$ is a difficulty parameter of a WTA task that is independent of $\delta$, $n$, and $k$. We then design a simple WTA circuit whose decision time is \[ O\Big( \big(\log\frac{1}{\delta} + \log k(n-k)\big)\, T_{\mathcal{R}} \Big). \] It turns out that for any fixed $\delta \in (0,1)$, this decision time is order-optimal in terms of its scaling in $n$, $k$, and $T_{\mathcal{R}}$.
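To get a feel for how the two decision-time expressions above scale, they can be evaluated numerically. The sketch below makes two assumptions not stated in the text: logarithms are natural, and the constant hidden by the $O(\cdot)$ notation is dropped, so the second function returns only the scaling term rather than an actual bound. The function names and the example parameters are hypothetical.

```python
import math


def _check(n, k, delta):
    """Validate the parameter ranges used in the text."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    if not 0 < delta < 1:
        raise ValueError("need delta in (0, 1)")


def wta_lower_bound(n, k, delta, t_r):
    """Lower bound ((1 - delta) * log(k(n-k) + 1) - 1) * T_R on decision time."""
    _check(n, k, delta)
    return ((1 - delta) * math.log(k * (n - k) + 1) - 1) * t_r


def wta_upper_scaling(n, k, delta, t_r):
    """Term inside the O(.): (log(1/delta) + log(k(n-k))) * T_R.

    The hidden constant is omitted, so this is a scaling term only.
    """
    _check(n, k, delta)
    return (math.log(1 / delta) + math.log(k * (n - k))) * t_r


# Example: 100 input spike trains, 5 winners, target error 0.1, T_R = 1.
lower = wta_lower_bound(100, 5, 0.1, 1.0)
upper = wta_upper_scaling(100, 5, 0.1, 1.0)
```

For fixed `delta`, both quantities grow like `log(k * (n - k)) * t_r`, which is the sense in which the circuit's decision time is order-optimal.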
We consider multi-armed bandit problems in social groups wherein each individual has bounded memory and shares the common goal of learning the best arm/option. We say an individual learns the best option if eventually (as $t \to \infty$) it pulls only the arm with the highest expected reward. While this goal is provably impossible for an isolated individual due to bounded memory, we show that, in social groups, this goal can be achieved easily with the aid of social persuasion (i.e., communication) as long as the communication networks/graphs satisfy some mild conditions. To deal with the interplay between the randomness in the rewards and in the social interaction, we employ the {\em mean-field approximation} method. Considering the possibility that the individuals in the networks may not be exchangeable when the communication networks are not cliques, we go beyond the classic mean-field techniques and apply a refined version of mean-field approximation: (1) Using coupling we show that, if the communication graph is connected and is either regular or has doubly-stochastic degree-weighted adjacency matrix, with probability $\to 1$ as the social group size $N \to \infty$, every individual in the social group learns the best option. (2) If the minimum degree of the graph diverges as $N \to \infty$, over an arbitrary but given finite time horizon, the sample paths describing the opinion evolutions of the individuals are asymptotically independent. In addition, the proportions of the population with different opinions converge to the unique solution of a system of ODEs. In the solution of the obtained ODEs, the proportion of the population holding the correct opinion converges to $1$ exponentially fast in time. Notably, our results hold even if the communication graphs are highly sparse.