In distributed machine learning, data is dispatched to multiple machines for processing. Motivated by the fact that similar data points often belong to the same or similar classes, and more generally, that classification rules of high accuracy tend to be locally simple but globally complex (Vapnik & Bottou 1993), we propose data-dependent dispatching that takes advantage of such structure. We present an in-depth analysis of this model, providing new algorithms with provable worst-case guarantees, analysis proving that existing scalable heuristics perform well under natural non-worst-case conditions, and techniques for extending a dispatching rule from a small sample to the entire distribution. We overcome novel technical challenges to satisfy important conditions for accurate distributed learning, including fault tolerance and balancedness. We empirically compare our approach with baselines based on random partitioning, balanced partition trees, and locality-sensitive hashing, showing that we achieve significantly higher accuracy on both synthetic and real-world image and advertising datasets. We also demonstrate that our technique scales strongly with the available computing power.
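As a rough illustration of data-dependent dispatching, the sketch below sends each point to the machines whose centers are nearest to it, with replication standing in for fault tolerance. It is not the algorithm of the abstract above: the function name `dispatch`, the `replicas` parameter, and the use of fixed Euclidean centers are assumptions made for this example, and the sketch does not enforce balancedness.

```python
import math

def dispatch(point, centers, replicas=1):
    """Return the indices of the `replicas` centers nearest to `point`.

    Hypothetical helper: each center represents one machine, and sending a
    point to several machines is a simple stand-in for fault tolerance.
    """
    order = sorted(range(len(centers)),
                   key=lambda j: math.dist(point, centers[j]))
    return order[:replicas]

# Three machines, each summarized by one center (illustrative values).
machine_centers = [(0.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
assigned = dispatch((0.0, 0.0), machine_centers, replicas=2)
print(assigned)
```

A balanced variant would additionally cap how many points each machine may receive, which this sketch leaves out.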
We consider the problem of sequentially allocating resources in a censored semi-bandits setup, where the learner allocates resources to the arms at each step and observes the loss. The loss depends on two hidden parameters, one specific to the arm but independent of the resource allocation, and the other dependent on the allocated resource. More specifically, the loss equals zero for an arm if the resource allocated to it exceeds a constant (but unknown) arm-dependent threshold. The goal is to learn a resource allocation that minimizes the expected loss. The problem is challenging because the loss distribution and threshold value of each arm are unknown. We study this setting by establishing its equivalence to Multiple-Play Multi-Armed Bandits (MP-MAB) and Combinatorial Semi-Bandits. Exploiting these equivalences, we derive optimal algorithms for our problem setting using known algorithms for MP-MAB and Combinatorial Semi-Bandits. Experiments on synthetically generated data validate the performance guarantees of the proposed algorithms.
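The censored loss model described above can be sketched as follows. This is a minimal illustration under stated assumptions: losses are taken to be Bernoulli with an arm-specific mean, an allocation exactly equal to the threshold is treated as sufficient, and the names `expected_loss` and `sample_losses` are hypothetical.

```python
import random

def expected_loss(allocation, thresholds, mean_losses):
    """Sum the mean losses of arms whose allocation is below the threshold."""
    return sum(mu for a, th, mu in zip(allocation, thresholds, mean_losses)
               if a < th)

def sample_losses(allocation, thresholds, mean_losses, rng):
    """Draw one round of per-arm losses (Bernoulli losses are an assumption)."""
    losses = []
    for a, th, mu in zip(allocation, thresholds, mean_losses):
        if a >= th:
            # Enough resource: the loss is censored to zero.
            losses.append(0)
        else:
            losses.append(1 if rng.random() < mu else 0)
    return losses

thresholds = [0.4, 0.3]    # unknown to the learner in the actual problem
mean_losses = [0.9, 0.2]   # unknown to the learner in the actual problem
allocation = [0.5, 0.1]    # covers arm 0 only
print(expected_loss(allocation, thresholds, mean_losses))
```

With a fixed budget, the learner's task is to find the allocation that minimizes this expected loss without knowing `thresholds` or `mean_losses` in advance.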
In several smart city applications, multiple resources must be allocated among competing agents that are coupled through such shared resources and are constrained, either through limitations of communication infrastructure or privacy considerations. We propose a distributed algorithm to solve such distributed multi-resource allocation problems with no direct inter-agent communication. We do so by extending a recently introduced additive-increase multiplicative-decrease (AIMD) algorithm, which uses very little communication between the system and agents. Namely, a control unit broadcasts a one-bit signal to agents whenever one of the allocated resources exceeds capacity. Agents then respond to this signal in a probabilistic manner. In the proposed algorithm, each agent decides its resource demand locally and is unaware of the resource allocations of other agents. In empirical results, we observe that the average allocations converge over time to optimal allocations.
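The one-bit AIMD mechanism described above can be sketched for a single resource as follows. This is a simplified illustration, not the proposed multi-resource algorithm: the fixed `backoff_prob`, the choice that agents who do not back off keep their demand unchanged, and the name `aimd_step` are all assumptions made for this example.

```python
import random

def aimd_step(demands, capacity, alpha, beta, backoff_prob, rng):
    """One AIMD round for a single shared resource.

    The control unit's one-bit signal is True when total demand exceeds
    capacity. Without the signal every agent adds `alpha`; with it, each
    agent independently multiplies its demand by `beta` with probability
    `backoff_prob` (a fixed probability is an assumption of this sketch).
    """
    signal = sum(demands) > capacity
    if not signal:
        return [d + alpha for d in demands], signal
    new_demands = [d * beta if rng.random() < backoff_prob else d
                   for d in demands]
    return new_demands, signal

rng = random.Random(0)
demands = [0.0, 0.0, 0.0]
for _ in range(200):
    demands, _signal = aimd_step(demands, capacity=6.0, alpha=0.1,
                                 beta=0.5, backoff_prob=0.5, rng=rng)
print(demands)
```

Each agent needs only its own demand and the broadcast bit, so no agent learns another agent's allocation.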
In the standard Mechanism Design framework, agents' messages are gathered at a central point and allocation/tax functions are calculated in a centralized manner, i.e., as functions of all network agents' messages. This requirement may cause communication and computation overhead and necessitates the design of mechanisms that alleviate this bottleneck. We consider a scenario where message transmission can only be performed locally, so that the mechanism allocation/tax functions can be calculated in a decentralized manner. Each agent transmits messages to her local neighborhood, as defined by a given message-exchange network, and her allocation/tax functions are functions only of the available neighborhood messages. This scenario gives rise to a novel research problem that we call Distributed Mechanism Design. In this paper, we propose two distributed mechanisms for network utility maximization problems that involve private and public goods with competition and cooperation between agents. As a concrete example, we use the problems of rate allocation in networks with either unicast or multirate multicast transmission protocols. The proposed mechanism for each of the protocols fully implements the optimal allocation in Nash equilibria, and its message space dimensionality scales linearly with the number of agents in the network.
Network slicing is considered one of the key enablers for 5G to support diversified services and application scenarios. This paper studies distributed network slicing that utilizes both the spectrum resource offered by the communication network and the computational resources of a coexisting fog computing network. We propose a novel distributed framework based on a new control plane entity, the regional orchestrator (RO), which can be deployed between base stations (BSs) and fog nodes to coordinate and control their bandwidth and computational resources. We propose a distributed resource allocation algorithm based on the Alternating Direction Method of Multipliers with Partial Variable Splitting (DistADMM-PVS). We prove that the proposed algorithm can minimize the average latency of the entire network and at the same time guarantee satisfactory latency performance for every supported type of service. Simulation results show that the proposed algorithm converges much faster than some other existing algorithms. Joint network slicing with both bandwidth and computational resources can offer around 15% overall latency reduction compared to network slicing with only a single resource.
We consider machine learning applications that train a model by leveraging data distributed over a trusted network, where communication constraints can create a performance bottleneck. A number of recent approaches propose to overcome this bottleneck through compression of gradient updates. However, as models become larger, so does the size of the gradient updates. In this paper, we propose an alternative approach to learning from distributed data that quantizes data instead of gradients, and can support learning in applications where the size of gradient updates is prohibitive. Our approach leverages the dependency of the computed gradient on data samples, which lie in a much smaller space, to perform the quantization in the lower-dimensional data space. At the cost of an extra gradient computation, the gradient estimate can be refined by conveying the difference between the gradient at the quantized data point and the original gradient using a small number of bits. Lastly, in order to save communication, our approach adds a layer that decides whether or not to transmit a quantized data sample based on its importance for learning. We analyze the convergence of the proposed approach for smooth convex and non-convex objective functions and show that we can achieve order-optimal convergence rates with communication that mostly depends on the data dimension rather than the model (gradient) dimension. We use our proposed algorithm to train ResNet models on the CIFAR-10 and ImageNet datasets, and show that we can achieve an order of magnitude savings over gradient compression methods. These communication savings come at the cost of increased computation at the learning agent, and thus our approach is beneficial in scenarios where communication load is the main problem.
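The idea of quantizing data rather than gradients can be sketched as follows. This is a minimal illustration, not the scheme of the abstract above: the uniform stochastic quantizer, the least-squares model, and the names `quantize`, `dequantize`, and `least_squares_gradient` are assumptions chosen for this example, and the refinement and importance-based transmission steps are omitted.

```python
import math
import random

def quantize(value, lo, hi, levels, rng):
    """Stochastically round `value` in [lo, hi] to one of `levels` grid indices."""
    scaled = (value - lo) / (hi - lo) * (levels - 1)
    lower = math.floor(scaled)
    # Round up with probability equal to the fractional part, so the
    # reconstructed value is unbiased in expectation.
    index = lower + (1 if rng.random() < scaled - lower else 0)
    return min(max(index, 0), levels - 1)

def dequantize(index, lo, hi, levels):
    """Map a grid index back to a real value."""
    return lo + index * (hi - lo) / (levels - 1)

def least_squares_gradient(w, x, y):
    """Gradient of 0.5 * (w . x - y)^2 with respect to w."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [residual * xi for xi in x]

rng = random.Random(0)
x, y = [0.3, 0.8], 1.0   # one data sample held by a remote node
w = [1.0, 2.0]           # current model held by the learner
# The node sends only the small integer indices; the learner rebuilds the
# sample and computes the gradient itself.
sent = [quantize(v, 0.0, 1.0, 5, rng) for v in x]
x_hat = [dequantize(i, 0.0, 1.0, 5) for i in sent]
print(least_squares_gradient(w, x_hat, y))
```

The communicated payload here scales with the length of `x`, not with the length of `w`, which is the trade the abstract describes: less communication in exchange for the learner computing the gradient itself.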