Research papers and master's and doctoral theses published by Yihong Wu

Chebyshev polynomials, moment matching, and optimal estimation of the unseen

98 - Yihong Wu, Pengkun Yang 2015

We consider the problem of estimating the support size of a discrete distribution whose minimum non-zero mass is at least $\frac{1}{k}$. Under the independent sampling model, we show that the sample complexity, i.e., the minimal sample size to achieve an additive error of $\epsilon k$ with probability at least 0.1, is within universal constant factors of $\frac{k}{\log k}\log^2\frac{1}{\epsilon}$, which improves the state-of-the-art result of $\frac{k}{\epsilon^2 \log k}$ in \cite{VV13}. A similar characterization of the minimax risk is also obtained. Our procedure is a linear estimator based on the Chebyshev polynomial and its approximation-theoretic properties, which can be evaluated in $O(n+\log^2 k)$ time and attains the sample complexity within a factor of six asymptotically. The superiority of the proposed estimator in terms of accuracy, computational efficiency and scalability is demonstrated on a variety of synthetic and real datasets.
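To get a feel for the size of the improvement, the two scalings quoted in the abstract can be compared numerically. The sketch below is illustrative only: it drops the universal constants (which the abstract does not specify), assumes natural logarithms, and uses hypothetical function names.

```python
import math


def scaling_chebyshev(k: int, eps: float) -> float:
    """(k / log k) * log^2(1/eps), with constants dropped (assumption)."""
    return k / math.log(k) * math.log(1.0 / eps) ** 2


def scaling_previous(k: int, eps: float) -> float:
    """k / (eps^2 * log k), with constants dropped (assumption)."""
    return k / (eps ** 2 * math.log(k))


if __name__ == "__main__":
    k = 10 ** 6  # hypothetical support-size bound
    for eps in (0.5, 0.1, 0.01):
        new = scaling_chebyshev(k, eps)
        old = scaling_previous(k, eps)
        print(f"eps={eps}: new={new:.3g}, previous={old:.3g}, ratio={new / old:.3g}")
```

With constants ignored, the ratio of the two expressions is $\epsilon^2 \log^2\frac{1}{\epsilon}$, which does not depend on $k$ and shrinks as $\epsilon$ decreases.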

Statistics Theory

Achieving Exact Cluster Recovery Threshold via Semidefinite Programming: Extensions

172 - Bruce Hajek, Yihong Wu, Jiaming Xu 2015

Resolving a conjecture of Abbe, Bandeira and Hall, the authors have recently shown that the semidefinite programming (SDP) relaxation of the maximum likelihood estimator achieves the sharp threshold for exactly recovering the community structure under the binary stochastic block model of two equal-sized clusters. The same was shown for the case of a single cluster and outliers. Extending the proof techniques, in this paper it is shown that SDP relaxations also achieve the sharp recovery threshold in the following cases: (1) Binary stochastic block model with two clusters of sizes proportional to network size but not necessarily equal; (2) Stochastic block model with a fixed number of equal-sized clusters; (3) Binary censored block model with the background graph being Erdős-Rényi. Furthermore, a sufficient condition is given for an SDP procedure to achieve exact recovery for the general case of a fixed number of clusters plus outliers. These results demonstrate the versatility of SDP relaxation as a simple, general purpose, computationally feasible methodology for community detection.

Machine Learning, Social and Information Networks, Probability

Minimax rates of entropy estimation on large alphabets via best polynomial approximation

145 - Yihong Wu, Pengkun Yang 2014

Consider the problem of estimating the Shannon entropy of a distribution over $k$ elements from $n$ independent samples. We show that the minimax mean-square error is within universal multiplicative constant factors of $$\Big(\frac{k}{n \log k}\Big)^2 + \frac{\log^2 k}{n}$$ if $n$ exceeds a constant factor of $\frac{k}{\log k}$; otherwise there exists no consistent estimator. This refines the recent result of Valiant-Valiant \cite{VV11} that the minimal sample size for consistent entropy estimation scales according to $\Theta(\frac{k}{\log k})$. The apparatus of best polynomial approximation plays a key role in both the construction of optimal estimators and, via a duality argument, the minimax lower bound.
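As a rough illustration of how the stated rate behaves, the expression from the abstract can be evaluated directly. This is a sketch under stated assumptions: universal constants are dropped, the threshold constant `c` is a hypothetical placeholder (the abstract does not give its value), natural logarithms are assumed, and the function name is invented for this example.

```python
import math
from typing import Optional


def minimax_mse_scaling(k: int, n: int, c: float = 1.0) -> Optional[float]:
    """Evaluate (k / (n log k))^2 + log^2(k) / n, up to constants.

    Returns None when n does not exceed c * k / log k, the regime in which
    the abstract says no consistent estimator exists. The constant c is an
    assumption made for illustration.
    """
    threshold = c * k / math.log(k)
    if n <= threshold:
        return None
    first_term = (k / (n * math.log(k))) ** 2
    second_term = math.log(k) ** 2 / n
    return first_term + second_term


if __name__ == "__main__":
    k = 1000  # hypothetical alphabet size
    for n in (10, 1000, 100000):
        print(n, minimax_mse_scaling(k, n))
```

For a fixed alphabet size, the first term decays like $1/n^2$ and the second like $1/n$, so the second term dominates once $n$ is large relative to $k$.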

Information Theory, Statistics Theory

Computational Lower Bounds for Community Detection on Random Graphs

135 - Bruce Hajek, Yihong Wu, Jiaming Xu 2014

This paper studies the problem of detecting the presence of a small dense community planted in a large Erdős-Rényi random graph $\mathcal{G}(N,q)$, where the edge probability within the community exceeds $q$ by a constant factor. Assuming the hardness of the planted clique detection problem, we show that the computational complexity of detecting the community exhibits the following phase transition phenomenon: As the graph size $N$ grows and the graph becomes sparser according to $q=N^{-\alpha}$, there exists a critical value of $\alpha = \frac{2}{3}$, below which there exists a computationally intensive procedure that can detect far smaller communities than any computationally efficient procedure, and above which a linear-time procedure is statistically optimal. The results also lead to average-case hardness results for recovering the dense community and approximating the densest $K$-subgraph.

Statistics Theory, Computational Complexity, Machine Learning

Electrical transport across metal/two-dimensional carbon junctions: Edge versus side contacts

173 - Yihong Wu, Ying Wang, Jiayi Wang 2012

Metal/two-dimensional carbon junctions are characterized by using a nanoprobe in an ultrahigh vacuum environment. Significant differences were found in the bias voltage (V) dependence of differential conductance (dI/dV) between edge and side contacts; the former exhibits a clear linear relationship (i.e., $dI/dV \propto V$), whereas the latter is characterized by a nonlinear dependence, $dI/dV \propto V^{3/2}$. Theoretical calculations confirm the experimental results, which are due to the robust two-dimensional nature of the carbon materials under study. Our work demonstrates the importance of contact geometry in graphene-based electronic devices.

Physics, Mesoscale and Nanoscale Physics

Visibility study of graphene multilayer structures

54 - Guoquan Teo, Haomin Wang, Yihong Wu 2008

The visibility of graphene sheets on different types of substrates has been investigated both theoretically and experimentally. Although single-layer graphene is observable on various types of dielectrics under an optical microscope, it is invisible when placed directly on most semiconductor and metallic substrates. We show that coating with a resist layer of optimum thickness is an effective way to enhance the contrast of graphene on various types of substrates, and that it makes single-layer graphene visible on most semiconductor and metallic substrates. Experiments have been performed to verify the results on quartz and NiFe-coated Si substrates. The results obtained will be useful for fabricating graphene-based devices on various types of substrates for electronics, spintronics and optoelectronics applications.

Materials Science
