
Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization

Added by Jess Banks
Publication date: 2016
Field: Physics
Language: English





We study the problem of detecting a structured, low-rank signal matrix corrupted with additive Gaussian noise. This includes clustering in a Gaussian mixture model, sparse PCA, and submatrix localization. Each of these problems is conjectured to exhibit a sharp information-theoretic threshold, below which the signal is too weak for any algorithm to detect. We derive upper and lower bounds on these thresholds by applying the first and second moment methods to the likelihood ratio between these planted models and null models where the signal matrix is zero. Our bounds differ by at most a factor of root two when the rank is large (in the clustering and submatrix localization problems, when the number of clusters or blocks is large) or the signal matrix is very sparse. Moreover, our upper bounds show that for each of these problems there is a significant regime where reliable detection is information-theoretically possible but where known algorithms such as PCA fail completely, since the spectrum of the observed matrix is uninformative. This regime is analogous to the conjectured hard but detectable regime for community detection in sparse graphs.
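The failure of PCA described above can be illustrated numerically in a rank-one spiked Wigner model (a simplified stand-in for the planted models in the abstract, not the paper's exact setting): the top eigenvalue of the observed matrix separates from the bulk edge at 2 only when the signal-to-noise ratio exceeds 1 (the BBP transition); below that, the spectrum is uninformative. A minimal sketch:

```python
import numpy as np

def top_eigenvalue(lam, n=500, seed=0):
    """Top eigenvalue of A = (lam/n) x x^T + W/sqrt(n): a rank-one spiked
    Wigner matrix with ||x||^2 = n and symmetric Gaussian noise W."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    x *= np.sqrt(n) / np.linalg.norm(x)   # normalize so that ||x||^2 = n
    G = rng.standard_normal((n, n))
    W = (G + G.T) / np.sqrt(2)            # Wigner: variance-1 off-diagonal entries
    A = lam * np.outer(x, x) / n + W / np.sqrt(n)
    return float(np.linalg.eigvalsh(A)[-1])  # eigvalsh returns ascending order

# Below the spectral threshold the top eigenvalue sticks to the bulk edge 2;
# above it, it detaches to roughly lam + 1/lam.
print(top_eigenvalue(0.5))   # close to 2: spectrum uninformative
print(top_eigenvalue(2.0))   # close to 2.5: signal visible to PCA
```

The information-theoretically detectable-but-spectrally-invisible regime the abstract describes lies below this spectral threshold but above the likelihood-ratio lower bound.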



Related research

We present a novel framework exploiting the cascade of phase transitions occurring during a simulated annealing of the Expectation-Maximisation algorithm to cluster datasets with multi-scale structures. Using the weighted local covariance, we can extract, a posteriori and without any prior knowledge, information on the number of clusters at different scales together with their size. We also study the linear stability of the iterative scheme to derive the threshold at which the first transition occurs and show how to approximate the next ones. Finally, we combine simulated annealing together with recent developments of regularised Gaussian mixture models to learn a principal graph from spatially structured datasets that can also exhibit many scales.
We study optimal estimation for sparse principal component analysis when the number of non-zero elements is small but on the same order as the dimension of the data. We employ the approximate message passing (AMP) algorithm and its state evolution to determine the information-theoretically minimal mean-squared error and the error achieved by AMP in the limit of large sizes. For the special case of rank one and large enough density of non-zeros, Deshpande and Montanari [1] proved that AMP is asymptotically optimal. We show that both for low density and for large rank the problem undergoes a series of phase transitions, suggesting the existence of a region of parameters where estimation is information-theoretically possible but where AMP (and presumably every other polynomial-time algorithm) fails. The analysis of the large-rank limit is particularly instructive.
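State evolution reduces the high-dimensional AMP dynamics to a scalar recursion for the overlap with the signal. As a simplified illustration (unit-variance Gaussian prior on a rank-one spiked Wigner model, not the sparse prior analyzed in the abstract), the recursion and its transition at signal-to-noise ratio 1 can be sketched as:

```python
def state_evolution(lam, m0=1e-3, iters=300):
    """Scalar state-evolution recursion m_{t+1} = lam^2 * m_t / (1 + m_t)
    for rank-one spiked Wigner estimation with a unit-variance Gaussian
    prior; m tracks AMP's squared correlation with the signal."""
    m = m0
    for _ in range(iters):
        m = lam ** 2 * m / (1 + m)
    return m

# Fixed points: m = 0 for lam <= 1, and m = lam^2 - 1 for lam > 1.
print(state_evolution(0.5))   # decays to 0: AMP recovers nothing
print(state_evolution(2.0))   # converges to 3 = lam^2 - 1: AMP succeeds
```

With a sparse prior the recursion can instead develop multiple stable fixed points, which is the mechanism behind the information-theoretic-versus-algorithmic gap the abstract describes.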
We consider the phase retrieval problem of reconstructing an $n$-dimensional real or complex signal $\mathbf{X}^{\star}$ from $m$ (possibly noisy) observations $Y_\mu = |\sum_{i=1}^n \Phi_{\mu i} X^{\star}_i/\sqrt{n}|$, for a large class of correlated real and complex random sensing matrices $\mathbf{\Phi}$, in a high-dimensional setting where $m, n \to \infty$ while $\alpha = m/n = \Theta(1)$. First, we derive sharp asymptotics for the lowest possible estimation error achievable statistically, and we unveil the existence of sharp phase transitions for the weak- and full-recovery thresholds as a function of the singular values of the matrix $\mathbf{\Phi}$. This is achieved by providing a rigorous proof of a result first obtained by the replica method from statistical mechanics. In particular, the information-theoretic transition to perfect recovery for full-rank matrices appears at $\alpha=1$ (real case) and $\alpha=2$ (complex case). Secondly, we analyze the performance of the best-known polynomial-time algorithm for this problem -- approximate message passing -- establishing the existence of a statistical-to-algorithmic gap depending, again, on the spectral properties of $\mathbf{\Phi}$. Our work provides an extensive classification of the statistical and algorithmic thresholds in high-dimensional phase retrieval for a broad class of random matrices.
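The observation model is easy to instantiate; a minimal sketch for the real case with an i.i.d. Gaussian sensing matrix (the simplest member of the ensembles considered; function names here are illustrative), showing the sign ambiguity that makes the problem "phaseless":

```python
import numpy as np

def phaseless_observations(x, m, seed=0):
    """Y_mu = |sum_i Phi_{mu i} x_i / sqrt(n)| for an i.i.d. Gaussian Phi."""
    rng = np.random.default_rng(seed)
    n = x.size
    Phi = rng.standard_normal((m, n))
    return np.abs(Phi @ x) / np.sqrt(n)

# The global sign of x (a global phase, in the complex case) is unrecoverable:
rng = np.random.default_rng(42)
x = rng.standard_normal(50)
y_plus = phaseless_observations(x, m=100)    # alpha = m/n = 2
y_minus = phaseless_observations(-x, m=100)  # same seed -> same Phi
print(np.allclose(y_plus, y_minus))          # True
```

Recovery thresholds are therefore always stated up to this global sign/phase.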
A key feature of the many-body localized phase is the breaking of ergodicity and, consequently, the emergence of local memory, revealed as the local preservation of information over time. As memory is necessarily a time-dependent concept, it has been partially captured by a few extant studies of dynamical quantities. However, these quantities are neither optimal nor democratic with respect to the input state, and as such a fundamental and complete information-theoretic understanding of local memory in the context of many-body localization remains elusive. We introduce the dynamical Holevo quantity as the true quantifier of local memory, outlining its advantages over other quantities such as the imbalance or the entanglement entropy. We find clear scaling behavior in its steady state across the many-body localization transition, and determine a family of two-parameter scaling ansätze which captures this behavior. We perform a comprehensive finite-size scaling analysis of this dynamical quantity, extracting the transition point and scaling exponents.
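The static Holevo quantity itself is straightforward to compute (the dynamical version in the abstract tracks this quantity for time-evolved states); a minimal sketch:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr[rho log2 rho], in bits."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]                     # drop zero eigenvalues (0 log 0 = 0)
    return float(-(p * np.log2(p)).sum())

def holevo_quantity(probs, states):
    """chi = S(sum_i p_i rho_i) - sum_i p_i S(rho_i): an upper bound on the
    classical information extractable from the ensemble {p_i, rho_i}."""
    avg = sum(p * rho for p, rho in zip(probs, states))
    return von_neumann_entropy(avg) - sum(
        p * von_neumann_entropy(rho) for p, rho in zip(probs, states))

# Orthogonal pure qubit states |0>, |1> with equal weight carry one full bit:
rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])
rho1 = np.array([[0.0, 0.0], [0.0, 1.0]])
print(holevo_quantity([0.5, 0.5], [rho0, rho1]))   # 1.0
```

A localized region that preserves such distinguishability over time retains local memory; an ergodic one scrambles it, driving the dynamical Holevo quantity toward zero.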
Leo Radzihovsky, 1997
We study the shape, elasticity, and fluctuations of the recently predicted (cond-mat/9510172) and subsequently observed in numerical simulations (cond-mat/9705059) tubule phase of anisotropic membranes, as well as the phase transitions into and out of it. This novel phase lies between the previously predicted flat and crumpled phases, both in temperature and in its physical properties: it is crumpled in one direction, and extended in the other. Its shape and elastic properties are characterized by a radius-of-gyration exponent $\nu$ and an anisotropy exponent $z$. We derive scaling laws for the radius of gyration $R_G(L_\perp, L_y)$ (i.e. the average thickness) of the tubule about a spontaneously selected straight axis and for the tubule undulations $h_{\mathrm{rms}}(L_\perp, L_y)$ transverse to its average extension. For phantom (i.e. non-self-avoiding) membranes, we predict $\nu=1/4$, $z=1/2$, and $\eta_\kappa=0$, exactly, in excellent agreement with simulations. For membranes embedded in a space of dimension $d<11$, self-avoidance greatly swells the tubule and suppresses its wild transverse undulations, changing its shape exponents $\nu$ and $z$. We give detailed scaling results for the shape of a tubule of arbitrary aspect ratio and compute a variety of correlation functions, as well as the anomalous elasticity of the tubules. Finally, we present a scaling theory for the shape of the membrane and its specific heat near the continuous transitions into and out of the tubule phase, and perform detailed renormalization group calculations for the crumpled-to-tubule transition for phantom membranes.