We consider the following general hidden hubs model: an $n \times n$ random matrix $A$ with a subset $S$ of $k$ special rows (hubs): entries in rows outside $S$ are generated from the probability distribution $p_0 \sim N(0,\sigma_0^2)$; for each row in $S$, some $k$ of its entries are generated from $p_1 \sim N(0,\sigma_1^2)$, $\sigma_1>\sigma_0$, and the rest of the entries from $p_0$. The problem is to identify the high-degree hubs efficiently. This model includes and significantly generalizes the planted Gaussian Submatrix Model, where the special entries are all in a $k \times k$ submatrix. There are two well-known barriers: if $k \geq c\sqrt{n\ln n}$, just the row sums are sufficient to find $S$ in the general model. For the submatrix problem, this can be improved by a $\sqrt{\ln n}$ factor to $k \ge c\sqrt{n}$ by spectral methods or combinatorial methods. In the variant with $p_0=\pm 1$ (with probability $1/2$ each) and $p_1 \equiv 1$, neither barrier has been broken. We give a polynomial-time algorithm to identify all the hidden hubs with high probability for $k \ge n^{0.5-\delta}$ for some $\delta>0$, when $\sigma_1^2>2\sigma_0^2$. The algorithm extends to the setting where planted entries might have different variances each at least as large as $\sigma_1^2$. We also show a nearly matching lower bound: for $\sigma_1^2 \le 2\sigma_0^2$, there is no polynomial-time Statistical Query algorithm for distinguishing between a matrix whose entries are all from $N(0,\sigma_0^2)$ and a matrix with $k=n^{0.5-\delta}$ hidden hubs for any $\delta>0$. The lower bound as well as the algorithm are related to whether the chi-squared distance of the two distributions diverges. At the critical value $\sigma_1^2=2\sigma_0^2$, we show that the general hidden hubs problem can be solved for $k \geq c\sqrt{n}(\ln n)^{1/4}$, improving on the naive row sum-based method.
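The model and the naive row-based baseline from the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's algorithm: since every entry has mean zero, the sketch assumes that "row sums" refers to sums of squared entries, and the function names, seed, and parameter values are hypothetical choices, picked so that $k$ lies well above $\sqrt{n\ln n}$.

```python
import numpy as np

def sample_hidden_hubs(n, k, sigma0, sigma1, rng):
    """Draw an n x n matrix from the hidden hubs model.

    Every entry starts as N(0, sigma0^2). Then k rows (the hubs) are chosen,
    and in each hub row k entries are redrawn from N(0, sigma1^2).
    """
    A = rng.normal(0.0, sigma0, size=(n, n))
    hubs = rng.choice(n, size=k, replace=False)
    for i in hubs:
        # Each hub row gets its own set of k planted columns (general model).
        cols = rng.choice(n, size=k, replace=False)
        A[i, cols] = rng.normal(0.0, sigma1, size=k)
    return A, np.sort(hubs)

def guess_hubs_by_row_energy(A, k):
    """Baseline: return the k rows with the largest sum of squared entries.

    Assumption: squared entries are used because the plain row sum has
    mean zero for hub and non-hub rows alike.
    """
    energy = (A ** 2).sum(axis=1)
    return np.sort(np.argsort(energy)[-k:])

# Hypothetical parameters: sqrt(n ln n) is about 49 here, so k = 120 is easy.
rng = np.random.default_rng(0)
n, k = 400, 120
A, hubs = sample_hidden_hubs(n, k, sigma0=1.0, sigma1=3.0, rng=rng)
guess = guess_hubs_by_row_energy(A, k)
```

Comparing `guess` with `hubs` shows whether the baseline recovered the planted rows. Shrinking `k` toward $\sqrt{n}$ is a simple way to watch this statistic lose its separation, which is the regime the abstract's algorithm targets.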
Reduced chi-squared is a very popular method for model assessment, model comparison, convergence diagnostic, and error estimation in astronomy. In this manuscript, we discuss the pitfalls involved in using reduced chi-squared. There are two independe
The density matrix in quantum mechanics parameterizes the statistical properties of the system under observation, just like a classical probability distribution does for classical systems. The expectation value of observables cannot be measured direc
We investigate the statistics of stationary points in the sum of squares of $N$ Gaussian random fields, which we call a chi-squared field. The behavior of such a field at a point is investigated, with particular attention paid to the formation of top
The darknet markets are notorious black markets in cyberspace, which involve selling or brokering drugs, weapons, stolen credit cards, and other illicit goods. To combat illicit transactions in the cyberspace, it is important to analyze the behaviors
Monitoring network traffic data to detect any hidden patterns of anomalies is a challenging and time-consuming task that requires high computing resources. To this end, an appropriate summarization technique is of great importance, where it can be a