Reconstructing Point Sets from Distance Distributions

Published by Shuai Huang

Publication date: 2018

Research field: Informatics Engineering

Paper language: English

Authors: Shuai Huang, Ivan Dokmanic

Data Structures and Algorithms, Information Theory, Machine Learning

We address the problem of reconstructing a set of points on a line or a loop from their unassigned noisy pairwise distances. When the points lie on a line, the problem is known as the turnpike; when they are on a loop, it is known as the beltway. We approximate the problem by discretizing the domain and representing the $N$ points via an $N$-hot encoding, which is a density supported on the discretized domain. We show how the distance distribution is then simply a collection of quadratic functionals of this density and propose to recover the point locations so that the estimated distance distribution matches the measured distance distribution. This can be cast as a constrained nonconvex optimization problem which we solve using projected gradient descent with a suitable spectral initializer. We derive conditions under which the proposed distance distribution matching approach locally converges to a global optimizer at a linear rate. Compared to the conventional backtracking approach, our method jointly reconstructs all the point locations and is robust to noise in the measurements. We substantiate these claims with state-of-the-art performance across a number of numerical experiments. Our method is the first practical approach to solve the large-scale noisy beltway problem where the points lie on a loop.
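To make the "quadratic functionals" statement concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names (`n_hot`, `distance_distribution_line`, `distance_distribution_loop`), the grid size, and the assumption that each point sits exactly on a grid bin (at most one point per bin) are all illustrative. For a density $x$ on $M$ bins, the number of pairs at distance $d$ on a line is $x^\top A_d x$, where $A_d$ has ones on the $d$-th superdiagonal; on a loop the shift wraps around.

```python
import numpy as np


def n_hot(indices, num_bins):
    """N-hot density on the grid: entry i counts the points in bin i."""
    x = np.zeros(num_bins)
    np.add.at(x, np.asarray(indices), 1.0)
    return x


def distance_distribution_line(x):
    """Pair counts for distances d = 1..M-1 on a line (turnpike setting).

    Each entry is the quadratic form x^T A_d x = sum_i x[i] * x[i + d],
    with A_d built explicitly as a shifted identity for clarity.
    """
    m = len(x)
    return np.array([x @ np.eye(m, k=d) @ x for d in range(1, m)])


def distance_distribution_loop(x):
    """Pair counts for loop distances d = 1..floor(M/2) (beltway setting).

    The shift is circular. When 2*d == M each antipodal pair is seen
    twice by the circular sum, so that entry is halved.
    """
    m = len(x)
    counts = []
    for d in range(1, m // 2 + 1):
        c = x @ np.roll(x, -d)  # sum_i x[i] * x[(i + d) % m]
        counts.append(c / 2 if 2 * d == m else c)
    return np.array(counts)


if __name__ == "__main__":
    x = n_hot([0, 2, 5], num_bins=8)
    print(distance_distribution_line(x))  # distances 2, 3 and 5 occur once each
    print(distance_distribution_loop(x))  # loop distances: 2 once, 3 twice
```

Under these assumptions, distance distribution matching amounts to choosing $x$ so that these quadratic forms agree with the measured counts; the constraints, spectral initializer, and projected gradient steps described in the abstract are not reproduced here.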

Ellen D. Zhong, Tristan Bepler, Joseph H. Davis (2019)

Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structure of proteins and other macromolecular complexes at near-atomic resolution. In single particle cryo-EM, the central problem is to reconstruct the three-dimensional structure of a macromolecule from $10^{4-7}$ noisy and randomly oriented two-dimensional projections. However, the imaged protein complexes may exhibit structural variability, which complicates reconstruction and is typically addressed using discrete clustering approaches that fail to capture the full range of protein dynamics. Here, we introduce a novel method for cryo-EM reconstruction that extends naturally to modeling continuous generative factors of structural heterogeneity. This method encodes structures in Fourier space using coordinate-based deep neural networks, and trains these networks from unlabeled 2D cryo-EM images by combining exact inference over image orientation with variational inference for structural heterogeneity. We demonstrate that the proposed method, termed cryoDRGN, can perform ab initio reconstruction of 3D protein complexes from simulated and real 2D cryo-EM image data. To our knowledge, cryoDRGN is the first neural network-based approach for cryo-EM reconstruction and the first end-to-end method for directly reconstructing continuous ensembles of protein structures from cryo-EM images.

Quantitative Methods, Computer Vision and Pattern Recognition, Machine Learning

Reconstructing Galaxy Spectral Energy Distributions from Broadband Photometry

I. Csabai (1999)

We present a novel approach to photometric redshifts, one that merges the advantages of both the template fitting and empirical fitting algorithms, without any of their disadvantages. This technique derives a set of templates, describing the spectral energy distributions of galaxies, from a catalog with both multicolor photometry and spectroscopic redshifts. The algorithm is essentially using the shapes of the templates as the fitting parameters. From simulated multicolor data we show that for a small training set of galaxies we can reconstruct robustly the underlying spectral energy distributions even in the presence of substantial errors in the photometric observations. We apply these techniques to the multicolor and spectroscopic observations of the Hubble Deep Field building a set of template spectra that reproduced the observed galaxy colors to better than 10%. Finally we demonstrate that these improved spectral energy distributions lead to a photometric-redshift relation for the Hubble Deep Field that is more accurate than standard template-based approaches.

Reconstructing faces from voices

Yandong Wen, Rita Singh, Bhiksha Raj (2019)

Voice profiling aims at inferring various human parameters from their speech, e.g. gender, age, etc. In this paper, we address the challenge posed by a subtask of voice profiling - reconstructing someone's face from their voice. The task is designed to answer the question: given an audio clip spoken by an unseen person, can we picture a face that has as many common elements, or associations as possible with the speaker, in terms of identity? To address this problem, we propose a simple but effective computational framework based on generative adversarial networks (GANs). The network learns to generate faces from voices by matching the identities of generated faces to those of the speakers, on a training set. We evaluate the performance of the network by leveraging a closely related task - cross-modal matching. The results show that our model is able to generate faces that match several biometric characteristics of the speaker, and results in matching accuracies that are much better than chance.

Sound, Computer Vision and Pattern Recognition, Machine Learning

Ranking the information content of distance measures

Aldo Glielmo, Claudio Zeni, Bingqing Cheng (2021)

Real-world data typically contain a large number of features that are often heterogeneous in nature, relevance, and also units of measure. When assessing the similarity between data points, one can build various distance measures using subsets of these features. Using the fewest features but still retaining sufficient information about the system is crucial in many statistical learning approaches, particularly when data are sparse. We introduce a statistical test that can assess the relative information retained when using two different distance measures, and determine if they are equivalent, independent, or if one is more informative than the other. This in turn allows finding the most informative distance measure out of a pool of candidates. The approach is applied to find the most relevant policy variables for controlling the Covid-19 epidemic and to find compact yet informative representations of atomic structures, but its potential applications are wide ranging in many branches of science.

Machine Learning, Information Theory

Testing Properties of Multiple Distributions with Few Samples

Maryam Aliakbarpour, Sandeep Silwal (2019)

We propose a new setting for testing properties of distributions while receiving samples from several distributions, but few samples per distribution. Given samples from $s$ distributions, $p_1, p_2, \ldots, p_s$, we design testers for the following problems: (1) Uniformity Testing: Testing whether all the $p_i$'s are uniform or $\epsilon$-far from being uniform in $\ell_1$-distance, (2) Identity Testing: Testing whether all the $p_i$'s are equal to an explicitly given distribution $q$ or $\epsilon$-far from $q$ in $\ell_1$-distance, and (3) Closeness Testing: Testing whether all the $p_i$'s are equal to a distribution $q$ which we have sample access to, or $\epsilon$-far from $q$ in $\ell_1$-distance. By assuming an additional natural condition about the source distributions, we provide sample optimal testers for all of these problems.

Data Structures and Algorithms, Discrete Mathematics, Machine Learning