
A Polynomial Time Algorithm for Lossy Population Recovery

Posted by: Ankur Moitra
Publication date: 2013
Research field: Computer Science
Paper language: English





We give a polynomial time algorithm for the lossy population recovery problem. In this problem, the goal is to approximately learn an unknown distribution on binary strings of length $n$ from lossy samples: for some parameter $\mu$, each coordinate of the sample is preserved with probability $\mu$ and otherwise is replaced by a '?'. The running time and number of samples needed for our algorithm are polynomial in $n$ and $1/\varepsilon$ for each fixed $\mu>0$. This improves on the algorithm of Wigderson and Yehudayoff, which runs in quasi-polynomial time for any $\mu > 0$, and the polynomial time algorithm of Dvir et al., which was shown to work for $\mu \gtrapprox 0.30$ by Batman et al. In fact, our algorithm also works in the more general framework of Batman et al., in which there is no a priori bound on the size of the support of the distribution. The algorithm we analyze is implicit in previous work; our main contribution is to analyze it by showing (via linear programming duality and connections to complex analysis) that a certain matrix associated with the problem has a robust local inverse even though its condition number is exponentially small. A corollary of our result is the first polynomial time algorithm for learning DNFs in the restriction access model of Dvir et al.
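To make the sampling model concrete, here is a minimal Python sketch of the $\mu$-lossy channel described above; the toy distribution is hypothetical and only for illustration:

    import random

    def lossy_sample(x, mu):
        # Pass a binary string x through the mu-lossy channel: each
        # coordinate survives independently with probability mu and is
        # replaced by '?' otherwise.
        return ''.join(c if random.random() < mu else '?' for c in x)

    # Hypothetical unknown distribution on {0,1}^4; the recovery problem is
    # to approximate it given only lossy samples like the ones printed here.
    population = {'0110': 0.7, '1011': 0.3}
    strings, weights = zip(*population.items())
    for _ in range(3):
        x = random.choices(strings, weights)[0]
        print(lossy_sample(x, mu=0.4))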




Read also

In list-decodable subspace recovery, the input is a collection of $n$ points, $\alpha n$ of which (for some $\alpha \ll 1/2$) are drawn i.i.d. from a distribution $\mathcal{D}$ with an isotropic rank-$r$ covariance $\Pi_*$ (the inliers), while the rest are arbitrary, potentially adversarial outliers. The goal is to recover an $O(1/\alpha)$-size list of candidate covariances that contains a $\hat{\Pi}$ close to $\Pi_*$. Two recent independent works (Raghavendra-Yau, Bakshi-Kothari 2020) gave the first efficient algorithms for this problem. These results, however, obtain an error that grows with the dimension (linearly in [RY] and logarithmically in [BK]) at the cost of quasi-polynomial running time, and rely on certifiable anti-concentration, a relatively strict condition satisfied essentially only by the Gaussian distribution. In this work, we improve on these results on all three fronts: dimension-independent error via a faster, fixed-polynomial running time under less restrictive distributional assumptions. Specifically, we give a $\mathrm{poly}(1/\alpha)\, d^{O(1)}$ time algorithm that outputs a list containing a $\hat{\Pi}$ satisfying $\|\hat{\Pi} - \Pi_*\|_F \leq O(1/\alpha)$. Our result only needs $\mathcal{D}$ to have certifiably hypercontractive degree-2 polynomials. As a result, in addition to Gaussians, our algorithm applies to the uniform distribution on the hypercube and $q$-ary cubes, and to arbitrary product distributions with subgaussian marginals. Prior work (Raghavendra and Yau, 2020) had identified such distributions as potential hard examples, as they do not exhibit strong enough anti-concentration. When $\mathcal{D}$ satisfies certifiable anti-concentration, we obtain a stronger error guarantee of $\|\hat{\Pi}-\Pi_*\|_F \leq \eta$ for any arbitrary $\eta > 0$ in $d^{O(\mathrm{poly}(1/\alpha) + \log(1/\eta))}$ time.
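The algorithm itself is based on sum-of-squares relaxations and is beyond a short sketch, but the input model and the error metric in the guarantee are easy to simulate. The NumPy snippet below does so with hypothetical sizes ($d$, $r$, $n$, $\alpha$ are illustrative, and the outliers here are random stand-ins for adversarial points):

    import numpy as np

    rng = np.random.default_rng(0)
    d, r, n, alpha = 20, 3, 1000, 0.1          # hypothetical problem sizes

    # Rank-r isotropic covariance: Pi_* projects onto a random r-dim subspace.
    B = np.linalg.qr(rng.standard_normal((d, r)))[0]   # d x r orthonormal basis
    Pi_star = B @ B.T

    n_in = int(alpha * n)
    inliers = rng.standard_normal((n_in, r)) @ B.T     # i.i.d. points on the subspace
    outliers = 10 * rng.standard_normal((n - n_in, d)) # stand-in for adversarial rest
    points = np.vstack([inliers, outliers])

    # The guarantee bounds the Frobenius distance of some list entry to Pi_*.
    def frob_error(Pi_hat):
        return np.linalg.norm(Pi_hat - Pi_star, 'fro')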
We study the problem of computing the maximum likelihood estimator (MLE) of multivariate log-concave densities. Our main result is the first computationally efficient algorithm for this problem. In more detail, we give an algorithm that, on input a set of $n$ points in $\mathbb{R}^d$ and an accuracy parameter $\epsilon>0$, runs in time $\mathrm{poly}(n, d, 1/\epsilon)$ and outputs a log-concave density that with high probability maximizes the log-likelihood up to an additive $\epsilon$. Our approach relies on a natural convex optimization formulation of the underlying problem that can be efficiently solved by a projected stochastic subgradient method. The main challenge lies in showing that a stochastic subgradient of our objective function can be efficiently approximated. To achieve this, we rely on structural results on approximation of log-concave densities and leverage classical algorithmic tools on volume approximation of convex bodies and uniform sampling from convex sets.
We consider the problem of computing the maximum likelihood multivariate log-concave distribution for a set of points. Specifically, we present an algorithm which, given $n$ points in $\mathbb{R}^d$ and an accuracy parameter $\epsilon>0$, runs in time $\mathrm{poly}(n,d,1/\epsilon)$ and returns a log-concave distribution which, with high probability, has the property that the likelihood of the $n$ points under the returned distribution is at most an additive $\epsilon$ less than the maximum likelihood that could be achieved via any log-concave distribution. This is the first computationally efficient (polynomial time) algorithm for this fundamental and practically important task. Our algorithm rests on a novel connection with exponential families: the maximum likelihood log-concave distribution belongs to a class of structured distributions which, while not an exponential family, locally possesses key properties of exponential families. This connection then allows the problem of computing the log-concave maximum likelihood distribution to be formulated as a convex optimization problem, and solved via an approximate first-order method. Efficiently approximating the (sub)gradients of the objective function of this optimization problem is quite delicate, and is the main technical challenge in this work.
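Both abstracts above reduce the estimation task to a convex program solved by a projected stochastic first-order method; their technical core is building the (approximate) subgradient oracle. The generic template itself is short. The sketch below assumes the oracle and the projection are supplied as inputs, and the quadratic toy objective at the end is purely illustrative:

    import numpy as np

    def projected_stochastic_subgradient(stoch_subgrad, project, theta0,
                                         steps=2000, step_size=0.1):
        # Generic projected stochastic subgradient descent; returns the
        # average iterate, as is standard for subgradient methods.
        theta = np.array(theta0, dtype=float)
        avg = theta.copy()
        for t in range(1, steps + 1):
            g = stoch_subgrad(theta)                   # noisy subgradient at theta
            theta = project(theta - step_size / np.sqrt(t) * g)
            avg += (theta - avg) / t
        return avg

    # Toy usage: minimize E[(theta - z)^2], z ~ N(1, 1), over the box [0, 0.5]^2.
    rng = np.random.default_rng(0)
    sol = projected_stochastic_subgradient(
        stoch_subgrad=lambda th: 2 * (th - (1 + rng.standard_normal(2))),
        project=lambda th: np.clip(th, 0.0, 0.5),
        theta0=np.zeros(2))
    print(sol)   # approaches the constrained optimum [0.5, 0.5]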
We study the optimization version of the equal cardinality set partition problem (where the absolute difference between the sums of the equal-sized partitions is minimized). While this problem is NP-hard and requires exponential complexity to solve in general, we have formulated a weaker version of this NP-hard problem, where the goal is to find a locally optimal solution. The local optimality considered in our work is under any swap of element pairs between the opposing partitions. To this end, we designed an algorithm which can produce such a locally optimal solution in $O(N^2)$ time and $O(N)$ space. Our approach does not require positive or integer inputs and works equally well under arbitrary input precisions. Thus, it is widely applicable in different problem scenarios.
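The local-optimality condition is easy to state in code. The naive loop below keeps applying improving single-pair swaps until no swap between the halves reduces the absolute difference; it illustrates the stopping condition only and does not reproduce the paper's $O(N^2)$ time, $O(N)$ space guarantee:

    def swap_local_search(vals):
        # Split into two equal-cardinality halves, then apply improving swaps
        # until no single cross-partition swap shrinks |sum(A) - sum(B)|.
        half = len(vals) // 2
        A, B = list(vals[:half]), list(vals[half:])
        diff = sum(A) - sum(B)
        improved = True
        while improved:
            improved = False
            for i in range(len(A)):
                for j in range(len(B)):
                    # Swapping A[i] and B[j] changes sum(A) - sum(B) by 2*(B[j] - A[i]).
                    new_diff = diff + 2 * (B[j] - A[i])
                    if abs(new_diff) < abs(diff):
                        A[i], B[j] = B[j], A[i]
                        diff = new_diff
                        improved = True
        return A, B

    # Works with negative and non-integer inputs, as the paper emphasizes.
    print(swap_local_search([8, 1, -2, 7, 4, 0.5, -1, 3]))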
For graphs $G$ and $H$, we say that $G$ is $H$-free if it does not contain $H$ as an induced subgraph. Already in the early 1980s Alekseev observed that if $H$ is connected, then the Max Weight Independent Set problem (MWIS) remains NP-hard in $H$-free graphs, unless $H$ is a path or a subdivided claw, i.e., a graph obtained from the three-leaf star by subdividing each edge some number of times (possibly zero). Since then, determining the complexity of MWIS in these remaining cases has been one of the most important problems in algorithmic graph theory. A general belief is that the problem is polynomial-time solvable, which is witnessed by algorithmic results for graphs excluding some small paths or subdivided claws. More conclusive evidence was given by the recent breakthrough result of Gartland and Lokshtanov [FOCS 2020]: they proved that MWIS can be solved in quasipolynomial time in $H$-free graphs, where $H$ is any fixed path. If $H$ is an arbitrary subdivided claw, we know much less: the problem admits a QPTAS and a subexponential-time algorithm [Chudnovsky et al., SODA 2019]. In this paper we make an important step towards solving the problem by showing that for any subdivided claw $H$, MWIS is polynomial-time solvable in $H$-free graphs of bounded degree.
