Finding Skewed Subcubes Under a Distribution

Published by: Roie Levin

Publication date: 2019

Research field: Informatics Engineering

Paper language: English

Authors: Parikshit Gopalan, Roie Levin, Udi Wieder

Data Structures and Algorithms, Computational Complexity, Probability

Abstract

Say that we are given samples from a distribution $\psi$ over an $n$-dimensional space. We expect or desire $\psi$ to behave like a product distribution (or a $k$-wise independent distribution over its marginals for small $k$). We propose the problem of enumerating/list-decoding all large subcubes where the distribution $\psi$ deviates markedly from what we expect; we refer to such subcubes as skewed subcubes. Skewed subcubes are certificates of dependencies between small subsets of variables in $\psi$. We motivate this problem by showing that it arises naturally in the context of algorithmic fairness and anomaly detection. In this work we focus on the special but important case where the space is the Boolean hypercube, and the expected marginals are uniform. We show that the obvious definition of skewed subcubes can lead to intractable list sizes, and propose a better definition, that of a minimal skewed subcube: a subcube whose skew cannot be attributed to a larger subcube that contains it. Our main technical contribution is a list-size bound for this definition and an algorithm to efficiently find all such subcubes. Both the bound and the algorithm rely on Fourier-analytic techniques, especially the powerful hypercontractive inequality. On the lower bounds side, we show that finding skewed subcubes is as hard as the sparse noisy parity problem, and hence our algorithms cannot be improved on substantially without a breakthrough on this problem, which is believed to be intractable. Motivated by this, we study alternate models allowing query access to $\psi$ where finding skewed subcubes might be easier.
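To make the problem statement concrete, here is a minimal brute-force sketch for the Boolean hypercube with uniform expected marginals. It is not the paper's Fourier-analytic algorithm, and it does not implement the minimality condition. It assumes a simple, hypothetical notion of skew: a subcube that fixes $k$ coordinates has mass $2^{-k}$ under the uniform distribution, and it is reported when its empirical mass is at least `ratio` times that value. The function name `skewed_subcubes` and its parameters are illustrative only.

```python
from collections import Counter
from itertools import combinations


def skewed_subcubes(samples, k, ratio):
    """Return (coords, values, mass) for every subcube that fixes k
    coordinates and whose empirical mass is >= ratio * 2**-k.

    Assumption: 'skew' here is a plain mass ratio against the uniform
    distribution; the paper's exact definition may differ.
    """
    n = len(samples[0])
    m = len(samples)
    expected = 2.0 ** -k  # mass of a codimension-k subcube under uniform
    found = []
    for coords in combinations(range(n), k):
        # Count how often each assignment to the chosen coordinates occurs.
        counts = Counter(tuple(x[i] for i in coords) for x in samples)
        for values, c in counts.items():
            mass = c / m
            if mass >= ratio * expected:
                found.append((coords, values, mass))
    return found


# Toy data: bits 0 and 1 always agree, bit 2 is free.
samples = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
print(skewed_subcubes(samples, k=2, ratio=2.0))
# [((0, 1), (0, 0), 0.5), ((0, 1), (1, 1), 0.5)]
```

The loop visits all $\binom{n}{k}$ coordinate sets, so its running time grows like $n^k$ times the sample size. That cost is the natural baseline against which the abstract's hardness statement should be read.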

Jessica Enright, Duncan Lee, Kitty Meeks (2020)

Understanding spatial correlation is vital in many fields including epidemiology and social science. Lee, Meeks and Pettersson (Stat. Comput. 2021) recently demonstrated that improved inference for areal unit count data can be achieved by carrying out modifications to a graph representing spatial correlations; specifically, they delete edges of the planar graph derived from border-sharing between geographic regions in order to maximise a specific objective function. In this paper we address the computational complexity of the associated graph optimisation problem. We demonstrate that this problem cannot be solved in polynomial time unless P = NP; we further show intractability for two simpler variants of the problem. We follow these results with two parameterised algorithms that exactly solve the problem in polynomial time in restricted settings. The first of these utilises dynamic programming on a tree decomposition, and runs in polynomial time if both the treewidth and maximum degree are bounded. The second algorithm is restricted to problem instances with maximum degree three, as may arise from triangulations of planar surfaces, but is an FPT algorithm when the maximum number of edges that can be removed is taken as the parameter.

Data Structures and Algorithms, Computational Complexity

On the complexity of finding internally vertex-disjoint long directed paths

Julio Araujo, Victor A. Campos, Ana Karolinna Maia (2017)

For two positive integers $k$ and $\ell$, a $(k \times \ell)$-spindle is the union of $k$ pairwise internally vertex-disjoint directed paths with $\ell$ arcs between two vertices $u$ and $v$. We are interested in the (parameterized) complexity of several problems consisting in deciding whether a given digraph contains a subdivision of a spindle, which generalize both the Maximum Flow and Longest Path problems. We obtain the following complexity dichotomy: for a fixed $\ell \geq 1$, finding the largest $k$ such that an input digraph $G$ contains a subdivision of a $(k \times \ell)$-spindle is polynomial-time solvable if $\ell \leq 3$, and NP-hard otherwise. We place special emphasis on finding spindles with exactly two paths and present FPT algorithms that are asymptotically optimal under the ETH. These algorithms are based on the technique of representative families in matroids, and also use color-coding as a subroutine. Finally, we study the case where the input graph is acyclic, and present several algorithmic and hardness results.

Data Structures and Algorithms, Computational Complexity

Parameterized Complexity of Finding Subgraphs with Hereditary Properties on Hereditary Graph Classes

David Eppstein, Siddharth Gupta, Elham Havvaei (2021)

We investigate the parameterized complexity of finding subgraphs with hereditary properties on graphs belonging to a hereditary graph class. Given a graph $G$, a non-trivial hereditary property $\Pi$ and an integer parameter $k$, the general problem $P(G,\Pi,k)$ asks whether there exist $k$ vertices of $G$ that induce a subgraph satisfying property $\Pi$. This problem, $P(G,\Pi,k)$, has been proved to be NP-complete by Lewis and Yannakakis. The parameterized complexity of this problem is shown to be W[1]-complete by Khot and Raman, if $\Pi$ includes all trivial graphs but not all complete graphs and vice versa; and is fixed-parameter tractable (FPT), otherwise. As the problem is W[1]-complete on general graphs when $\Pi$ includes all trivial graphs but not all complete graphs and vice versa, it is natural to further investigate the problem on restricted graph classes. Motivated by this line of research, we study the problem on graphs which also belong to a hereditary graph class and establish a framework which settles the parameterized complexity of the problem for various hereditary graph classes. In particular, we show that: $P(G,\Pi,k)$ is solvable in polynomial time when the graph $G$ is co-bipartite and $\Pi$ is the property of being planar, bipartite or triangle-free (or vice-versa). $P(G,\Pi,k)$ is FPT when the graph $G$ is planar, bipartite or triangle-free and $\Pi$ is the property of being planar, bipartite or triangle-free, or graph $G$ is co-bipartite and $\Pi$ is the property of being co-bipartite. $P(G,\Pi,k)$ is W[1]-complete when the graph $G$ is $C_4$-free, $K_{1,4}$-free or a unit disk graph and $\Pi$ is the property of being either planar or bipartite.

Data Structures and Algorithms, Computational Complexity

Feedback Vertex Set and Even Cycle Transversal for H-Free Graphs: Finding Large Block Graphs

Giacomo Paesani, Daniel Paulusma, Paweł Rzążewski (2021)

We prove new complexity results for Feedback Vertex Set and Even Cycle Transversal on $H$-free graphs, that is, graphs that do not contain some fixed graph $H$ as an induced subgraph. In particular, we prove that both problems are polynomial-time solvable for $sP_3$-free graphs for every integer $s \geq 1$. Our results show that both problems exhibit the same behaviour on $H$-free graphs (subject to some open cases). This is in part explained by a new general algorithm we design for finding in a graph $G$ a largest induced subgraph whose blocks belong to some finite class ${\cal C}$ of graphs. We also compare our results with the state-of-the-art results for the Odd Cycle Transversal problem, which is known to behave differently on $H$-free graphs.

Data Structures and Algorithms, Computational Complexity, Discrete Mathematics

(Learned) Frequency Estimation Algorithms under Zipfian Distribution

Anders Aamand, Piotr Indyk, Ali Vakilian (2019)

The frequencies of the elements in a data stream are an important statistical measure and the task of estimating them arises in many applications within data analysis and machine learning. Two of the most popular algorithms for this problem, Count-Min and Count-Sketch, are widely used in practice. In a recent work [Hsu et al., ICLR19], it was shown empirically that augmenting Count-Min and Count-Sketch with a machine learning algorithm leads to a significant reduction of the estimation error. The experiments were complemented with an analysis of the expected error incurred by Count-Min (both the standard and the augmented version) when the input frequencies follow a Zipfian distribution. Although the authors established that the learned version of Count-Min has lower estimation error than its standard counterpart, their analysis of the standard Count-Min algorithm was not tight. Moreover, they provided no similar analysis for Count-Sketch. In this paper we resolve these problems. First, we provide a simple tight analysis of the expected error incurred by Count-Min. Second, we provide the first error bounds for both the standard and the augmented version of Count-Sketch. These bounds are nearly tight and again demonstrate an improved performance of the learned version of Count-Sketch. In addition to demonstrating tight gaps between the aforementioned algorithms, we believe that our bounds for the standa

Data Structures and Algorithms
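For readers unfamiliar with the standard (non-learned) Count-Min sketch named in the abstract above, the following is a minimal illustration of the data structure. It is a sketch under stated assumptions, not the authors' code: the class name `CountMin` is hypothetical, and salted calls to Python's built-in `hash` stand in for the pairwise-independent hash functions that the usual analysis assumes.

```python
import random


class CountMin:
    """Standard Count-Min sketch: `depth` rows of `width` counters."""

    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Assumption: one random salt per row plus Python's hash() stands in
        # for a family of pairwise-independent hash functions.
        self.salts = [rng.getrandbits(64) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, row, item):
        return hash((self.salts[row], item)) % self.width

    def add(self, item, count=1):
        # Each item increments exactly one counter in every row.
        for row in range(len(self.table)):
            self.table[row][self._bucket(row, item)] += count

    def estimate(self, item):
        # Collisions only add to a counter, so the minimum over rows
        # is the tightest upper bound on the true frequency.
        return min(
            self.table[row][self._bucket(row, item)]
            for row in range(len(self.table))
        )


# Zipfian-style toy stream: item i appears about 100 / i times.
sketch = CountMin(width=16, depth=4)
truth = {i: 100 // i for i in range(1, 51)}
for item, freq in truth.items():
    sketch.add(item, freq)
print(all(sketch.estimate(i) >= f for i, f in truth.items()))  # True
```

With non-negative updates the estimate never undershoots the true frequency; the size of the overshoot depends on `width`, `depth`, and how skewed the input frequencies are.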
