Locally checkable labeling problems (LCLs) are distributed graph problems in which a solution is globally feasible if it is locally feasible in all constant-radius neighborhoods. Vertex colorings, maximal independent sets, and maximal matchings are examples of LCLs. On the one hand, it is known that some LCLs benefit exponentially from randomness. For example, any deterministic distributed algorithm that finds a sinkless orientation requires $\Theta(\log n)$ rounds in the LOCAL model, while the randomized complexity of the problem is $\Theta(\log \log n)$ rounds. On the other hand, there are also many LCLs in which randomness is useless. Previously, it was not known if there are any LCLs that benefit from randomness, but only subexponentially. We show that such problems exist: for example, there is an LCL with deterministic complexity $\Theta(\log^2 n)$ rounds and randomized complexity $\Theta(\log n \log \log n)$ rounds.
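To make "locally checkable" concrete, the following is a minimal sketch of a radius-1 verifier for one of the LCLs named above, the maximal independent set. The adjacency-set graph representation and the names `locally_feasible` and `globally_feasible` are assumptions made for illustration only; they do not come from the paper.

```python
from typing import Dict, Set

# Hypothetical representation: node id -> set of neighbor ids.
Graph = Dict[int, Set[int]]


def locally_feasible(graph: Graph, in_set: Dict[int, bool], v: int) -> bool:
    """Check the radius-1 neighborhood of v for the maximal independent set LCL."""
    neighbor_in_set = any(in_set[u] for u in graph[v])
    if in_set[v]:
        # Independence: a selected node has no selected neighbor.
        return not neighbor_in_set
    # Maximality: an unselected node has at least one selected neighbor.
    return neighbor_in_set


def globally_feasible(graph: Graph, in_set: Dict[int, bool]) -> bool:
    """A labeling is a solution exactly when every local check passes."""
    return all(locally_feasible(graph, in_set, v) for v in graph)


# Path 0 - 1 - 2 - 3
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
good = {0: True, 1: False, 2: True, 3: False}
bad = {0: True, 1: False, 2: False, 3: False}  # nodes 2 and 3 violate maximality

print(globally_feasible(path, good))
print(globally_feasible(path, bad))
```

The verifier only ever inspects a node and its direct neighbors, which is the constant-radius property that defines an LCL; it says nothing about how many rounds are needed to find a feasible labeling.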
Consider any locally checkable labeling problem $\Pi$ in rooted regular trees: there is a finite set of labels $\Sigma$, and for each label $x \in \Sigma$ we specify which label combinations of the children are permitted for an internal node with label $x$ (the leaf nodes are unconstrained). This formalism is expressive enough to capture many classic problems studied in distributed computing, including vertex coloring, edge coloring, and maximal independent set. We show that the distributed computational complexity of any such problem $\Pi$ falls in one of the following classes: it is $O(1)$, $\Theta(\log^* n)$, $\Theta(\log n)$, or $n^{\Theta(1)}$ rounds in trees with $n$ nodes (and all of these classes are nonempty). We show that the complexity of any given problem is the same in all four standard models of distributed graph algorithms: deterministic $\mathsf{LOCAL}$, randomized $\mathsf{LOCAL}$, deterministic $\mathsf{CONGEST}$, and randomized $\mathsf{CONGEST}$. In particular, we show that randomness does not help in this setting, and the complexity class $\Theta(\log \log n)$ does not exist (while it does exist in the broader setting of general trees). We also show how to systematically determine the complexity class of any such problem $\Pi$, i.e., whether $\Pi$ takes $O(1)$, $\Theta(\log^* n)$, $\Theta(\log n)$, or $n^{\Theta(1)}$ rounds. While the algorithm may take exponential time in the size of the description of $\Pi$, it is nevertheless practical: we provide a freely available implementation of the classifier algorithm, and it is fast enough to classify many problems of interest.
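The formalism described above can be sketched as a small data structure plus a validity check. This is an illustration only, not the classifier implementation mentioned in the abstract: the `Node` class, the `Problem` type, and the choice to treat the children as an unordered combination (stored as a sorted tuple) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


# Hypothetical encoding of a problem Pi: for each label x in Sigma, the set of
# permitted child-label combinations, each stored as a sorted tuple.
Problem = Dict[str, Set[Tuple[str, ...]]]


def is_valid(problem: Problem, root: Node) -> bool:
    """Return True if every internal node has a permitted child combination."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.label not in problem:
            return False  # label is not in Sigma
        if node.children:  # leaves are unconstrained
            combo = tuple(sorted(child.label for child in node.children))
            if combo not in problem[node.label]:
                return False
            stack.extend(node.children)
    return True


# 2-coloring of rooted binary trees: both children must differ from the parent.
two_coloring: Problem = {"a": {("b", "b")}, "b": {("a", "a")}}

ok_tree = Node("a", [Node("b", [Node("a"), Node("a")]), Node("b")])
bad_tree = Node("a", [Node("a"), Node("b")])

print(is_valid(two_coloring, ok_tree))
print(is_valid(two_coloring, bad_tree))
```

Because a problem is fully described by a finite table from labels to permitted child combinations, it can be handed to a program as input, which is what makes an automatic classification of its complexity class conceivable.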
Mixup is a popular data augmentation technique based on taking convex combinations of pairs of examples and their labels. This simple technique has been shown to substantially improve both the robustness and the generalization of the trained model. However, it is not well understood why such improvement occurs. In this paper, we provide theoretical analysis to demonstrate how using Mixup in training helps model robustness and generalization. For robustness, we show that minimizing the Mixup loss corresponds to approximately minimizing an upper bound of the adversarial loss. This explains why models obtained by Mixup training exhibit robustness to several kinds of adversarial attacks such as the Fast Gradient Sign Method (FGSM). For generalization, we prove that Mixup augmentation corresponds to a specific type of data-adaptive regularization which reduces overfitting. Our analysis provides new insights and a framework to understand Mixup.
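As a minimal sketch of the augmentation step itself, the convex combination of a pair of examples and their labels can be written in a few lines. The function names, the list-based vectors, and drawing the mixing weight from a Beta(alpha, alpha) distribution are assumptions for illustration; the abstract does not specify how the weight is sampled.

```python
import random
from typing import List, Sequence, Tuple


def mixup_pair(
    x_i: Sequence[float],
    y_i: Sequence[float],
    x_j: Sequence[float],
    y_j: Sequence[float],
    lam: float,
) -> Tuple[List[float], List[float]]:
    """Convex combination of two examples and of their (one-hot) labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y


def sample_lambda(alpha: float, rng: random.Random) -> float:
    """Assumed sampling scheme: mixing weight drawn from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


rng = random.Random(0)
lam = sample_lambda(alpha=1.0, rng=rng)

# With a fixed weight of 0.25 the mixed example sits closer to the second input.
x_mix, y_mix = mixup_pair([0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0], 0.25)
print(x_mix, y_mix)
```

The same weight is applied to the inputs and to the labels, so the mixed label records how much of each original example went into the mixed input.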
We unveil the connections between Frank-Wolfe (FW) type algorithms and the momentum in Accelerated Gradient Methods (AGM). On the negative side, these connections illustrate why momentum is unlikely to be effective for FW type algorithms. The encouraging message behind this link, on the other hand, is that momentum is useful for FW on a class of problems. In particular, we prove that a momentum variant of FW, which we term accelerated Frank-Wolfe (AFW), converges with a faster rate $\tilde{\mathcal{O}}(\frac{1}{k^2})$ on certain constraint sets, while retaining the same $\mathcal{O}(\frac{1}{k})$ rate as FW in the general case. Given the possible acceleration of AFW at almost no extra cost, it is thus a competitive alternative to FW. Numerical experiments on benchmark machine learning tasks further validate our theoretical findings.
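For readers unfamiliar with the baseline, here is a minimal sketch of plain Frank-Wolfe, not the AFW variant proposed in the paper, whose update rule the abstract does not give. The probability simplex as constraint set, the quadratic objective, the step size $2/(k+2)$, and all names are assumptions chosen to keep the example short.

```python
from typing import Callable, List


def frank_wolfe_simplex(
    grad: Callable[[List[float]], List[float]],
    dim: int,
    iterations: int,
) -> List[float]:
    """Plain Frank-Wolfe over the probability simplex with step size 2 / (k + 2)."""
    x = [1.0 / dim] * dim  # feasible starting point
    for k in range(iterations):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of <g, v>
        # is the vertex e_i with the smallest gradient coordinate.
        i = min(range(dim), key=lambda j: g[j])
        step = 2.0 / (k + 2)
        # Move toward the vertex; a convex combination stays feasible.
        x = [(1.0 - step) * xj for xj in x]
        x[i] += step
    return x


# Minimize f(x) = 0.5 * ||x - target||^2, whose minimizer lies inside the simplex.
target = [0.2, 0.5, 0.3]


def quadratic_grad(x: List[float]) -> List[float]:
    return [a - b for a, b in zip(x, target)]


solution = frank_wolfe_simplex(quadratic_grad, dim=3, iterations=2000)
print(solution)
```

Each iteration needs only a linear minimization over the constraint set instead of a projection, which is the property that makes FW type methods attractive and that a momentum variant would want to preserve.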
A rich line of work has addressed the computational complexity of locally checkable labelings (LCLs), illustrating the landscape of possible complexities. In this paper, we study the landscape of LCL complexities under bandwidth restrictions. Our main results are twofold. First, we show that on trees, the CONGEST complexity of an LCL problem is asymptotically equal to its complexity in the LOCAL model. An analogous statement for general (non-LCL) problems is known to be false. Second, we show that for general graphs this equivalence does not hold: we provide an LCL problem that can be solved in $O(\log n)$ rounds in the LOCAL model but requires $\tilde{\Omega}(n^{1/2})$ rounds in the CONGEST model.
This paper investigates the power of preprocessing in the CONGEST model. Schmid and Suomela (ACM HotSDN 2013) introduced the SUPPORTED CONGEST model to study the application of distributed algorithms in Software-Defined Networks (SDNs). In this paper, we show that a large class of lower bounds in the CONGEST model still hold in the SUPPORTED model, highlighting the robustness of these bounds. This also raises the question of how much preprocessing helps in the CONGEST model.