
Self-Correlation and Maximum Independence in Finite Relations

 Added by EPTCS
 Publication date 2015
 Research language English
 Authors Dilian Gurov





We consider relations with no order on their attributes, as in Database Theory. An independent partition of the set of attributes S of a finite relation R is any partition X of S such that the join of the projections of R over the elements of X yields R. Identifying independent partitions has many applications and corresponds conceptually to revealing orthogonality between sets of dimensions in multidimensional point spaces. A subset of S is termed self-correlated if there is a value of each of its attributes such that no tuple of R contains all those values. This paper uncovers a connection between independence and self-correlation, showing that the maximum independent partition is the least fixed point of a certain inflationary transformer $\alpha$ that operates on the finite lattice of partitions of S. The transformer $\alpha$ is defined via the minimal self-correlated subsets of S. We use some additional properties of $\alpha$ to show that the said fixed point is still the limit of the standard approximation sequence, just as in Kleene's well-known fixed point theorem for continuous functions.
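The two definitions in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names, the representation of a relation as a list of dicts, and the example relation `R` are all hypothetical, and the sketch assumes that "a value of each attribute" ranges over the values that actually occur in R. The transformer $\alpha$ is not sketched, since the abstract does not give its definition.

```python
from itertools import product

def project(relation, attrs):
    """Project a relation (iterable of dicts) onto a set of attributes."""
    return {frozenset((a, t[a]) for a in attrs) for t in relation}

def is_independent(relation, partition):
    """True if joining the projections of `relation` over the blocks of
    `partition` gives back exactly `relation`."""
    original = {frozenset(t.items()) for t in relation}
    projections = [project(relation, block) for block in partition]
    # The blocks are disjoint, so their natural join is a Cartesian product.
    joined = {frozenset().union(*combo) for combo in product(*projections)}
    return joined == original

def is_self_correlated(relation, attrs):
    """True if some choice of one value per attribute in `attrs` is contained
    in no tuple of `relation`. Assumption: values are drawn from those that
    occur in the relation for that attribute."""
    attrs = list(attrs)
    domains = [{t[a] for t in relation} for a in attrs]
    present = {tuple(t[a] for a in attrs) for t in relation}
    return any(combo not in present for combo in product(*domains))

# Hypothetical example: a and b always agree, c varies freely.
R = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 0},
    {"a": 1, "b": 1, "c": 1},
]

print(is_independent(R, [{"a", "b"}, {"c"}]))    # True
print(is_independent(R, [{"a"}, {"b"}, {"c"}]))  # False
print(is_self_correlated(R, {"a", "b"}))         # True: a=0, b=1 never occurs
print(is_self_correlated(R, {"a", "c"}))         # False
```

In this example the self-correlated pair {a, b} is exactly what prevents the partition from splitting a and b apart, which is the kind of connection between the two notions that the abstract describes.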




Read More

Cencheng Shen, 2020
A number of universally consistent dependence measures have been recently proposed for testing independence, such as distance correlation, kernel correlation, multiscale graph correlation, etc. They provide a satisfactory solution for dependence testing in low dimensions, but often exhibit decreasing power for high-dimensional data, a phenomenon that has been recognized but remains mostly uncharted. In this paper, we aim to better understand the high-dimensional testing scenarios and explore a procedure that is robust against increasing dimension. To that end, we propose the maximum marginal correlation method and characterize high-dimensional dependence structures via the notion of dependent dimensions. We prove that the maximum method can be valid and universally consistent for testing high-dimensional dependence under regularity conditions, and demonstrate when and how the maximum method may outperform other methods. The methodology can be implemented by most existing dependence measures, has superior testing power in a variety of common high-dimensional settings, and is computationally efficient for big data analysis when using the distance correlation chi-square test.
In this work we introduce a notion of independence based on finite-state automata: two infinite words are independent if neither helps to compress the other using one-to-one finite-state transducers with auxiliary input. We prove that, as expected, the set of independent pairs of infinite words has Lebesgue measure 1. We show that the join of two independent normal words is normal. However, the independence of two normal words is not guaranteed if we just require that their join is normal. To prove this we construct a normal word $x_1x_2x_3\ldots$ where $x_{2n}=x_n$ for every $n$.
The combinatorics of squares in a word depends on how the equivalence of halves of the square is defined. We consider Abelian squares, parameterized squares, and order-preserving squares. The word $uv$ is an Abelian (parameterized, order-preserving) square if $u$ and $v$ are equivalent in the Abelian (parameterized, order-preserving) sense. The maximum number of ordinary squares in a word is known to be asymptotically linear, but the exact bound is still investigated. We present several results on the maximum number of distinct squares for nonstandard subword equivalence relations. Let $\mathit{SQ}_{\mathrm{Abel}}(n,\sigma)$ and $\mathit{SQ}'_{\mathrm{Abel}}(n,\sigma)$ denote the maximum number of Abelian squares in a word of length $n$ over an alphabet of size $\sigma$, which are distinct as words and which are nonequivalent in the Abelian sense, respectively. For $\sigma\ge 2$ we prove that $\mathit{SQ}_{\mathrm{Abel}}(n,\sigma)=\Theta(n^2)$, $\mathit{SQ}'_{\mathrm{Abel}}(n,\sigma)=\Omega(n^{3/2})$ and $\mathit{SQ}'_{\mathrm{Abel}}(n,\sigma) = O(n^{11/6})$. We also give linear bounds for parameterized and order-preserving squares for alphabets of constant size: $\mathit{SQ}_{\mathrm{param}}(n,O(1))=\Theta(n)$, $\mathit{SQ}_{\mathrm{op}}(n,O(1))=\Theta(n)$. The upper bounds have quadratic dependence on the alphabet size for order-preserving squares and exponential dependence for parameterized squares. As a side result we construct infinite words over the smallest alphabet which avoid nontrivial order-preserving squares and nontrivial parameterized cubes (nontrivial parameterized squares cannot be avoided in an infinite word).
Multiple interval graphs are variants of interval graphs where instead of a single interval, each vertex is assigned a set of intervals on the real line. We study the complexity of the MAXIMUM CLIQUE problem in several classes of multiple interval graphs. The MAXIMUM CLIQUE problem, or the problem of finding the size of the maximum clique, is known to be NP-complete for $t$-interval graphs when $t\geq 3$ and polynomial-time solvable when $t=1$. The problem is also known to be NP-complete in $t$-track graphs when $t\geq 4$ and polynomial-time solvable when $t\leq 2$. We show that MAXIMUM CLIQUE is already NP-complete for unit 2-interval graphs and unit 3-track graphs. Further, we show that the problem is APX-complete for 2-interval graphs, 3-track graphs, unit 3-interval graphs and unit 4-track graphs. We also introduce two new classes of graphs called $t$-circular interval graphs and $t$-circular track graphs and study the complexity of the MAXIMUM CLIQUE problem in them. On the positive side, we present a polynomial time $t$-approximation algorithm for WEIGHTED MAXIMUM CLIQUE on $t$-interval graphs, improving earlier work with approximation ratio $4t$.
This paper aims at comparing two coupling approaches as basic layers for building clustering criteria, suited for modularizing and clustering very large networks. We briefly use optimal transport theory as a starting point, and as a means, to derive two canonical couplings: statistical independence and logical indetermination. A symmetric list of properties is provided, notably the so-called Monge's properties, applied to contingency matrices and justifying the $\otimes$ versus $\oplus$ notation. A study is proposed that highlights logical indetermination, because it is by far the lesser known of the two. Finally, we estimate the average difference between both couplings as the key explanation of their usually close results in network clustering.