Necessary and Sufficient Conditions for Novel Word Detection in Separable Topic Models

463 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Weicong Ding

تاريخ النشر 2013

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Weicong Ding - Prakash Ishwar - Mohammad H. Rohban

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The simplicial condition and other stronger conditions that imply it have recently played a central role in developing polynomial time algorithms with provable asymptotic consistency and sample complexity guarantees for topic estimation in separable topic models. Of these algorithms, those that rely solely on the simplicial condition are impractical while the practical ones need stronger conditions. In this paper, we demonstrate, for the first time, that the simplicial condition is a fundamental, algorithm-independent, information-theoretic necessary condition for consistent separable topic estimation. Furthermore, under solely the simplicial condition, we present a practical quadratic-complexity algorithm based on random projections which consistently detects all novel words of all topics using only up to second-order empirical word moments. This algorithm is amenable to distributed implementation making it attractive for big-data scenarios involving a network of large distributed databases.

قيم البحث

81 - Weicong Ding , Prakash Ishwar , Venkatesh Saligrama 2015

We develop necessary and sufficient conditions and a novel provably consistent and efficient algorithm for discovering topics (latent factors) from observations (documents) that are realized from a probabilistic mixture of shared latent factors that have certain properties. Our focus is on the class of topic models in which each shared latent factor contains a novel word that is unique to that factor, a property that has come to be known as separability. Our algorithm is based on the key insight that the novel words correspond to the extreme points of the convex hull formed by the row-vectors of a suitably normalized word co-occurrence matrix. We leverage this geometric insight to establish polynomial computation and sample complexity bounds based on a few isotropic random projections of the rows of the normalized word co-occurrence matrix. Our proposed random-projections-based algorithm is naturally amenable to an efficient distributed implementation and is attractive for modern web-scale distributed data mining applications.

التعلم الآلي الحساب واللغة استرجاع المعلومات

A necessary and sufficient criterion for multipartite separable states

103 - Shengjun Wu , Xuemei Chen , Yongde Zhang 2000

We present a necessary and sufficient condition for the separability of multipartite quantum states, this criterion also tells us how to write a multipartite separable state as a convex sum of separable pure states. To work out this criterion, we nee d to solve a set of equations, actually it is easy to solve these quations analytically if the density matrix of the given quantum state has few nonzero eigenvalues.

فيزياء الكم

Necessary and sufficient conditions for the identifiability of observation-driven models

59 - Franc{c}ois Roueff , Randal Douc (TIPIC-SAMOVAR 2019

In this contribution we are interested in proving that a given observation-driven model is identifiable. In the case of a GARCH(p, q) model, a simple sufficient condition has been established in [1] for showing the consistency of the quasi-maximum li kelihood estimator. It turns out that this condition applies for a much larger class of observation-driven models, that we call the class of linearly observation-driven models. This class includes standard integer valued observation-driven time series, such as the log-linear Poisson GARCH or the NBIN-GARCH models.

نظرية الإحصاء نظرية الإحصاء

Necessary and Sufficient Conditions for Success of the Nuclear Norm Heuristic for Rank Minimization

334 - Benjamin Recht , Weiyu Xu , Babak Hassibi 2008

Minimizing the rank of a matrix subject to constraints is a challenging problem that arises in many applications in control theory, machine learning, and discrete geometry. This class of optimization problems, known as rank minimization, is NP-HARD, and for most practical problems there are no efficient algorithms that yield exact solutions. A popular heuristic algorithm replaces the rank function with the nuclear norm--equal to the sum of the singular values--of the decision variable. In this paper, we provide a necessary and sufficient condition that quantifies when this heuristic successfully finds the minimum rank solution of a linear constraint set. We additionally provide a probability distribution over instances of the affine rank minimization problem such that instances sampled from this distribution satisfy our conditions for success with overwhelming probability provided the number of constraints is appropriately large. Finally, we give empirical evidence that these probabilistic bounds provide accurate predictions of the heuristics performance in non-asymptotic scenarios.

التحسين والتحكم حساب التعلم الالي

Necessary and sufficient conditions for regularity of interval parametric matrices

70 - Evgenija D. Popova 2021

Matrix regularity is a key to various problems in applied mathematics. The sufficient conditions, used for checking regularity of interval parametric matrices, usually fail in case of large parameter intervals. We present necessary and sufficient con ditions for regularity of interval parametric matrices in terms of boundary parametric hypersurfaces, parametric solution sets, determinants, real spectral radiuses. The initial n-dimensional problem involving K interval parameters is replaced by numerous problems involving 1<= t <= min(n-1, K) interval parameters, in particular t=1 is most attractive. The advantages of the proposed methodology are discussed along with its application for finding the interval hull solution to interval parametric linear system and for determining the regularity radius of an interval parametric matrix.

التحليل العددي التحليل العددي