Let $G$ be a DAG with $n$ vertices and $m$ edges. Two vertices $u,v$ are incomparable if $u$ does not reach $v$ and vice versa. We denote by $w_G$ the \emph{width} of a DAG $G$, that is, the maximum size of a set of pairwise incomparable vertices of $G$. In this paper we present an algorithm that computes a dominance drawing of a DAG $G$ in $k$ dimensions, where $w_G \le k \le \frac{n}{2}$. The time required by the algorithm is $O(kn)$, with a precomputation time of $O(km)$, needed to compute a \emph{compressed transitive closure} of $G$, and extra $O(n^2 w_G)$ or $O(n^3)$ time if we want $k=w_G$. Our algorithm gives a tighter bound on the dominance dimension of a DAG. As corollaries, we obtain a new family of graphs having a 2-dimensional dominance drawing and a new upper bound on the dimension of a partial order. We also introduce the concepts of a transitive module and of the dimensional neck, $w_N$, of a DAG $G$, and we show how these concepts can be used to improve the above results.
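For concreteness, recall that a $k$-dimensional dominance drawing assigns to each vertex a vector of $k$ coordinates such that $u$ reaches $v$ exactly when every coordinate of $u$ is at most the corresponding coordinate of $v$. The following minimal Python sketch (the function names are ours, not from the paper) checks this property for a candidate coordinate assignment against the transitive closure of a small DAG.

```python
from itertools import product

def reachable(adj):
    """Transitive closure of a DAG given as {vertex: set(successors)}."""
    reach = {u: set() for u in adj}
    def dfs(u, start):
        for v in adj[u]:
            if v not in reach[start]:
                reach[start].add(v)
                dfs(v, start)
    for u in adj:
        dfs(u, u)
    return reach

def is_dominance_drawing(adj, coords):
    """Check that u reaches v iff coords[u] <= coords[v] componentwise (u != v)."""
    reach = reachable(adj)
    for u, v in product(adj, adj):
        if u == v:
            continue
        dominated = all(a <= b for a, b in zip(coords[u], coords[v]))
        if dominated != (v in reach[u]):
            return False
    return True

# A 4-vertex DAG of width 2 together with a 2-dimensional dominance drawing for it.
adj = {'s': {'a', 'b'}, 'a': {'t'}, 'b': {'t'}, 't': set()}
coords = {'s': (0, 0), 'a': (1, 2), 'b': (2, 1), 't': (3, 3)}
print(is_dominance_drawing(adj, coords))  # True
```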
We consider the problem of placing arrow heads in directed graph drawings so that they do not overlap other drawn objects. This yields drawings in which edge directions can be deduced unambiguously. We show hardness of the problem, present exact and heuristic algorithms, and report on a practical study.
We investigate the problem of identity testing for multidimensional histogram distributions. A distribution $p: D \rightarrow \mathbb{R}_+$, where $D \subseteq \mathbb{R}^d$, is called a $k$-histogram if there exists a partition of the domain into $k$ axis-aligned rectangles such that $p$ is constant within each such rectangle. Histograms are one of the most fundamental nonparametric families of distributions and have been extensively studied in computer science and statistics. We give the first identity tester for this problem with \emph{sub-learning} sample complexity in any fixed dimension and a nearly-matching sample complexity lower bound. In more detail, let $q$ be an unknown $d$-dimensional $k$-histogram distribution in fixed dimension $d$, and $p$ be an explicitly given $d$-dimensional $k$-histogram. We want to correctly distinguish, with probability at least $2/3$, between the cases $p = q$ and $|p-q|_1 \geq \epsilon$. We design an algorithm for this hypothesis testing problem with sample complexity $O((\sqrt{k}/\epsilon^2) 2^{d/2} \log^{2.5 d}(k/\epsilon))$ that runs in sample-polynomial time. Our algorithm is robust to model misspecification, i.e., it succeeds even if $q$ is only promised to be \emph{close} to a $k$-histogram. Moreover, for $k = 2^{\Omega(d)}$, we show a sample complexity lower bound of $(\sqrt{k}/\epsilon^2) \cdot \Omega(\log(k)/d)^{d-1}$ when $d \geq 2$. That is, for any fixed dimension $d$, our upper and lower bounds are nearly matching. Prior to our work, the sample complexity of the $d=1$ case was well understood, but no algorithm with sub-learning sample complexity was known, even for $d=2$. Our new upper and lower bounds have interesting conceptual implications regarding the relation between learning and testing in this setting.
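To make the tested object concrete, the sketch below (all names hypothetical) represents a $d$-dimensional $k$-histogram exactly as defined above, namely as a list of axis-aligned rectangles each carrying a constant density value, and evaluates the density at a query point. It illustrates the model only, not the identity tester itself.

```python
import numpy as np

# A 2-dimensional 2-histogram on the unit square: a partition of the domain into
# axis-aligned rectangles, with the density constant on each one.
# Rectangle format (our convention): (lows, highs, value), half-open boxes [low, high).
histogram = [
    (np.array([0.0, 0.0]), np.array([0.5, 1.0]), 1.2),  # left half of the unit square
    (np.array([0.5, 0.0]), np.array([1.0, 1.0]), 0.8),  # right half
]

def density(p, x):
    """Evaluate a k-histogram p at the point x (0 outside the domain)."""
    for lows, highs, value in p:
        if np.all(lows <= x) and np.all(x < highs):
            return value
    return 0.0

print(density(histogram, np.array([0.25, 0.7])))  # 1.2
print(density(histogram, np.array([0.75, 0.2])))  # 0.8
```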
We study the problem of a seller dynamically pricing $d$ distinct types of indivisible goods, when faced with the online arrival of unit-demand buyers drawn independently from an unknown distribution. The goods are not in limited supply, but can only be produced at a limited rate and are costly to produce. The seller observes only the bundle of goods purchased on each day, but nothing else about the buyer's valuation function. Our main result is a dynamic pricing algorithm for optimizing welfare (including the seller's cost of production) that runs in time and a number of rounds that are polynomial in $d$ and the approximation parameter. We are able to do this despite the fact that (i) the price-response function is not continuous, and even its fractional relaxation is a non-concave function of the prices, and (ii) the welfare is not observable to the seller. We derive this result as an application of a general technique for optimizing welfare over \emph{divisible} goods, which is of independent interest. When buyers have strongly concave, Hölder continuous valuation functions over $d$ divisible goods, we give a general polynomial time dynamic pricing technique. We are able to apply this technique to the setting of unit-demand buyers despite the fact that in that setting the goods are not divisible, and the natural fractional relaxation of a unit-demand valuation is not strongly concave. In order to apply our general technique, we introduce a novel price randomization procedure which has the effect of implicitly inducing buyers to regularize their valuations with a strongly concave function. Finally, we also extend our results to a limited-supply setting in which the number of copies of each good cannot be replenished.
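As an illustration of the interaction model only (not of the paper's pricing algorithm), the sketch below simulates one round: the seller posts a price vector over the $d$ good types, a unit-demand buyer with privately drawn values purchases the single most profitable good if its utility is positive, and the seller observes only the resulting bundle. The function names and the value distribution are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3  # number of good types

def buyer_response(values, prices):
    """A unit-demand buyer purchases the single good maximizing value minus price,
    or nothing if every good yields negative utility."""
    utilities = values - prices
    best = int(np.argmax(utilities))
    bundle = np.zeros(d, dtype=int)
    if utilities[best] > 0:
        bundle[best] = 1
    return bundle

# One round of the interaction: the seller posts prices and observes only the bundle.
prices = np.array([1.0, 2.0, 1.5])
values = rng.uniform(0, 3, size=d)  # drawn from a distribution unknown to the seller
print(buyer_response(values, prices))
```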
Multidimensional packing problems generalize classical packing problems such as Bin Packing and Multiprocessor Scheduling by allowing the jobs to be $d$-dimensional vectors. While the approximability of the scalar problems is well understood, there has been a significant gap between the approximation algorithms and the hardness results for the multidimensional variants. In this paper, we close this gap by giving almost tight hardness results for these problems.
1. We show that Vector Bin Packing has no polynomial time $\Omega(\log d)$ factor asymptotic approximation algorithm when $d$ is a large constant, assuming $\textsf{P} \neq \textsf{NP}$. This matches the $\ln d + O(1)$ factor approximation algorithms (Chekuri, Khanna SICOMP 2004; Bansal, Caprara, Sviridenko SICOMP 2009; Bansal, Eliáš, Khan SODA 2016) up to constants.
2. We show that Vector Scheduling has no polynomial time algorithm with an approximation ratio of $\Omega\left((\log d)^{1-\epsilon}\right)$ when $d$ is part of the input, assuming $\textsf{NP} \nsubseteq \textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right)$. This almost matches the $O\left(\frac{\log d}{\log\log d}\right)$ factor algorithms (Harris, Srinivasan JACM 2019; Im, Kell, Kulkarni, Panigrahi SICOMP 2019). We also show that the problem is NP-hard to approximate within $(\log\log d)^{\omega(1)}$.
3. We show that Vector Bin Covering is NP-hard to approximate within $\Omega\left(\frac{\log d}{\log\log d}\right)$ when $d$ is part of the input, almost matching the $O(\log d)$ factor algorithm (Alon et al., Algorithmica 1998).
Previously, no hardness results that grow with $d$ were known for Vector Scheduling and Vector Bin Covering when $d$ is part of the input, or for Vector Bin Packing when $d$ is a fixed constant.
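For readers less familiar with the setting, the sketch below illustrates the Vector Bin Packing problem itself: items are $d$-dimensional vectors, and a bin can hold any set of items whose coordinatewise sum stays at most one in every dimension. The greedy first-fit rule shown is only a baseline used for illustration; it is not one of the approximation algorithms or reductions discussed above.

```python
import numpy as np

def first_fit_vector_bin_packing(items):
    """Place d-dimensional items (entries in [0, 1]) into bins of capacity 1 in
    every coordinate, using plain first fit. Returns (assignment, number of bins)."""
    bins = []        # each bin is the coordinatewise load vector of its items
    assignment = []  # bin index chosen for each item, in input order
    for item in items:
        for i, load in enumerate(bins):
            if np.all(load + item <= 1.0):
                bins[i] = load + item
                assignment.append(i)
                break
        else:
            bins.append(np.array(item, dtype=float))
            assignment.append(len(bins) - 1)
    return assignment, len(bins)

items = [np.array(v) for v in [(0.6, 0.2), (0.5, 0.5), (0.3, 0.7), (0.4, 0.1)]]
print(first_fit_vector_bin_packing(items))  # ([0, 1, 0, 1], 2)
```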
In this paper we present novel algorithms for several multidimensional data processing problems. We consider problems related to the computation of restricted clusters and of the diameter of a set of points under a new distance function. We also consider two string (one-dimensional data) processing problems: one concerning an optimal encoding method, and one concerning the computation of the number of occurrences of a substring within a string generated by a grammar. The algorithms have been thoroughly analyzed from a theoretical point of view, and some of them have also been evaluated experimentally.