ترغب بنشر مسار تعليمي؟ اضغط هنا

Necessary and Sufficient Conditions for Novel Word Detection in Separable Topic Models

155   0   0.0 ( 0 )
 نشر من قبل Weicong Ding
 تاريخ النشر 2013
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

The simplicial condition and other stronger conditions that imply it have recently played a central role in developing polynomial time algorithms with provable asymptotic consistency and sample complexity guarantees for topic estimation in separable topic models. Of these algorithms, those that rely solely on the simplicial condition are impractical while the practical ones need stronger conditions. In this paper, we demonstrate, for the first time, that the simplicial condition is a fundamental, algorithm-independent, information-theoretic necessary condition for consistent separable topic estimation. Furthermore, under solely the simplicial condition, we present a practical quadratic-complexity algorithm based on random projections which consistently detects all novel words of all topics using only up to second-order empirical word moments. This algorithm is amenable to distributed implementation making it attractive for big-data scenarios involving a network of large distributed databases.



قيم البحث

اقرأ أيضاً

We develop necessary and sufficient conditions and a novel provably consistent and efficient algorithm for discovering topics (latent factors) from observations (documents) that are realized from a probabilistic mixture of shared latent factors that have certain properties. Our focus is on the class of topic models in which each shared latent factor contains a novel word that is unique to that factor, a property that has come to be known as separability. Our algorithm is based on the key insight that the novel words correspond to the extreme points of the convex hull formed by the row-vectors of a suitably normalized word co-occurrence matrix. We leverage this geometric insight to establish polynomial computation and sample complexity bounds based on a few isotropic random projections of the rows of the normalized word co-occurrence matrix. Our proposed random-projections-based algorithm is naturally amenable to an efficient distributed implementation and is attractive for modern web-scale distributed data mining applications.
We present a necessary and sufficient condition for the separability of multipartite quantum states, this criterion also tells us how to write a multipartite separable state as a convex sum of separable pure states. To work out this criterion, we nee d to solve a set of equations, actually it is easy to solve these quations analytically if the density matrix of the given quantum state has few nonzero eigenvalues.
In this contribution we are interested in proving that a given observation-driven model is identifiable. In the case of a GARCH(p, q) model, a simple sufficient condition has been established in [1] for showing the consistency of the quasi-maximum li kelihood estimator. It turns out that this condition applies for a much larger class of observation-driven models, that we call the class of linearly observation-driven models. This class includes standard integer valued observation-driven time series, such as the log-linear Poisson GARCH or the NBIN-GARCH models.
Minimizing the rank of a matrix subject to constraints is a challenging problem that arises in many applications in control theory, machine learning, and discrete geometry. This class of optimization problems, known as rank minimization, is NP-HARD, and for most practical problems there are no efficient algorithms that yield exact solutions. A popular heuristic algorithm replaces the rank function with the nuclear norm--equal to the sum of the singular values--of the decision variable. In this paper, we provide a necessary and sufficient condition that quantifies when this heuristic successfully finds the minimum rank solution of a linear constraint set. We additionally provide a probability distribution over instances of the affine rank minimization problem such that instances sampled from this distribution satisfy our conditions for success with overwhelming probability provided the number of constraints is appropriately large. Finally, we give empirical evidence that these probabilistic bounds provide accurate predictions of the heuristics performance in non-asymptotic scenarios.
70 - Evgenija D. Popova 2021
Matrix regularity is a key to various problems in applied mathematics. The sufficient conditions, used for checking regularity of interval parametric matrices, usually fail in case of large parameter intervals. We present necessary and sufficient con ditions for regularity of interval parametric matrices in terms of boundary parametric hypersurfaces, parametric solution sets, determinants, real spectral radiuses. The initial n-dimensional problem involving K interval parameters is replaced by numerous problems involving 1<= t <= min(n-1, K) interval parameters, in particular t=1 is most attractive. The advantages of the proposed methodology are discussed along with its application for finding the interval hull solution to interval parametric linear system and for determining the regularity radius of an interval parametric matrix.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا