ﻻ يوجد ملخص باللغة العربية
Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining. In correlation clustering, one receives as input a signed graph and the goal is to partition it to minimize the number of disagreements. In this work we propose a massively parallel computation (MPC) algorithm for this problem that is considerably faster than prior work. In particular, our algorithm uses machines with memory sublinear in the number of nodes in the graph and returns a constant approximation while running only for a constant number of rounds. To the best of our knowledge, our algorithm is the first that can provably approximate a clustering problem on graphs using only a constant number of MPC rounds in the sublinear memory regime. We complement our analysis with an experimental analysis of our techniques.
We study graph connectivity problem in MPC model. On an undirected graph with $n$ nodes and $m$ edges, $O(log n)$ round connectivity algorithms have been known for over 35 years. However, no algorithms with better complexity bounds were known. In thi
We study fundamental graph problems such as graph connectivity, minimum spanning forest (MSF), and approximate maximum (weight) matching in a distributed setting. In particular, we focus on the Adaptive Massively Parallel Computation (AMPC) model, wh
Clustering is a fundamental tool for analyzing large data sets. A rich body of work has been devoted to designing data-stream algorithms for the relevant optimization problems such as $k$-center, $k$-median, and $k$-means. Such algorithms need to be
We show that the ratio of matched individuals to blocking pairs grows linearly with the number of propose--accept rounds executed by the Gale--Shapley algorithm for the stable marriage problem. Consequently, the participants can arrive at an almost s
In this paper we present t