ﻻ يوجد ملخص باللغة العربية
Decentralized nonconvex optimization has received increasing attention in recent years in machine learning due to its advantages in system robustness, data privacy, and implementation simplicity. However, three fundamental challenges in designing decentralized optimization algorithms are how to reduce their sample, communication, and memory complexities. In this paper, we propose a underline{g}radient-underline{t}racking-based underline{sto}chastic underline{r}ecursive underline{m}omentum (GT-STORM) algorithm for efficiently solving nonconvex optimization problems. We show that to reach an $epsilon^2$-stationary solution, the total number of sample evaluations of our algorithm is $tilde{O}(m^{1/2}epsilon^{-3})$ and the number of communication rounds is $tilde{O}(m^{-1/2}epsilon^{-3})$, which improve the $O(epsilon^{-4})$ costs of sample evaluations and communications for the existing decentralized stochastic gradient algorithms. We conduct extensive experiments with a variety of learning models, including non-convex logistical regression and convolutional neural networks, to verify our theoretical findings. Collectively, our results contribute to the state of the art of theories and algorithms for decentralized network optimization.
Decentralized federated learning (DFL) is a powerful framework of distributed machine learning and decentralized stochastic gradient descent (SGD) is a driving engine for DFL. The performance of decentralized SGD is jointly influenced by communicatio
One popular trend in meta-learning is to learn from many training tasks a common initialization for a gradient-based method that can be used to solve a new task with few samples. The theory of meta-learning is still in its early stages, with several
Federated Learning (FL) refers to the paradigm where multiple worker nodes (WNs) build a joint model by using local data. Despite extensive research, for a generic non-convex FL problem, it is not clear, how to choose the WNs and the servers update d
Decentralized learning enables a group of collaborative agents to learn models using a distributed dataset without the need for a central parameter server. Recently, decentralized learning algorithms have demonstrated state-of-the-art results on benc
Federated learning (FL) is a prevailing distributed learning paradigm, where a large number of workers jointly learn a model without sharing their training data. However, high communication costs could arise in FL due to large-scale (deep) learning m