No Arabic abstract
Random uniform sampling has been studied in various statistical tasks but few of them have covered the Q-error metric for cardinality estimation (CE). In this paper, we analyze the confidence intervals of random uniform sampling with and without replacement for single-table CE. Results indicate that the upper Q-error bound depends on the sample size and true cardinality. Our bound gives a rule-of-thumb for how large a sample should be kept for single-table CE.
Kriging based on Gaussian random fields is widely used in reconstructing unknown functions. The kriging method has pointwise predictive distributions which are computationally simple. However, in many applications one would like to predict for a range of untried points simultaneously. In this work we obtain some error bounds for the (simple) kriging predictor under the uniform metric. It works for a scattered set of input points in an arbitrary dimension, and also covers the case where the covariance function of the Gaussian process is misspecified. These results lead to a better understanding of the rate of convergence of kriging under the Gaussian or the Matern correlation functions, the relationship between space-filling designs and kriging models, and the robustness of the Matern correlation functions.
We propose a Bayesian approach, called the posterior spectral embedding, for estimating the latent positions in random dot product graphs, and prove its optimality. Unlike the classical spectral-based adjacency/Laplacian spectral embedding, the posterior spectral embedding is a fully-likelihood based graph estimation method taking advantage of the Bernoulli likelihood information of the observed adjacency matrix. We develop a minimax-lower bound for estimating the latent positions, and show that the posterior spectral embedding achieves this lower bound since it both results in a minimax-optimal posterior contraction rate, and yields a point estimator achieving the minimax risk asymptotically. The convergence results are subsequently applied to clustering in stochastic block models, the result of which strengthens an existing result concerning the number of mis-clustered vertices. We also study a spectral-based Gaussian spectral embedding as a natural Bayesian analogy of the adjacency spectral embedding, but the resulting posterior contraction rate is sub-optimal with an extra logarithmic factor. The practical performance of the proposed methodology is illustrated through extensive synthetic examples and the analysis of a Wikipedia graph data.
We deal with a planar random flight ${(X(t),Y(t)),0<tleq T}$ observed at $n+1$ equidistant times $t_i=iDelta_n,i=0,1,...,n$. The aim of this paper is to estimate the unknown value of the parameter $lambda$, the underlying rate of the Poisson process. The planar random flights are not markovian, then we use an alternative argument to derive a pseudo-maximum likelihood estimator $hat{lambda}$ of the parameter $lambda$. We consider two different types of asymptotic schemes and show the consistency, the asymptotic normality and efficiency of the estimator proposed. A Monte Carlo analysis for small sample size $n$ permits us to analyze the empirical performance of $hat{lambda}$. A different approach permits us to introduce an alternative estimator of $lambda$ which is consistent, asymptotically normal and asymptotically efficient without the request of other assumptions.
We propose a one-step procedure to estimate the latent positions in random dot product graphs efficiently. Unlike the classical spectral-based methods such as the adjacency and Laplacian spectral embedding, the proposed one-step procedure takes advantage of both the low-rank structure of the expected adjacency matrix and the Bernoulli likelihood information of the sampling model simultaneously. We show that for each vertex, the corresponding row of the one-step estimator converges to a multivariate normal distribution after proper scaling and centering up to an orthogonal transformation, with an efficient covariance matrix. The initial estimator for the one-step procedure needs to satisfy the so-called approximate linearization property. The one-step estimator improves the commonly-adopted spectral embedding methods in the following sense: Globally for all vertices, it yields an asymptotic sum of squares error no greater than those of the spectral methods, and locally for each vertex, the asymptotic covariance matrix of the corresponding row of the one-step estimator dominates those of the spectral embeddings in spectra. The usefulness of the proposed one-step procedure is demonstrated via numerical examples and the analysis of a real-world Wikipedia graph dataset.
Consider the classical supervised learning problem: we are given data $(y_i,{boldsymbol x}_i)$, $ile n$, with $y_i$ a response and ${boldsymbol x}_iin {mathcal X}$ a covariates vector, and try to learn a model $f:{mathcal X}to{mathbb R}$ to predict future responses. Random features methods map the covariates vector ${boldsymbol x}_i$ to a point ${boldsymbol phi}({boldsymbol x}_i)$ in a higher dimensional space ${mathbb R}^N$, via a random featurization map ${boldsymbol phi}$. We study the use of random features methods in conjunction with ridge regression in the feature space ${mathbb R}^N$. This can be viewed as a finite-dimensional approximation of kernel ridge regression (KRR), or as a stylized model for neural networks in the so called lazy training regime. We define a class of problems satisfying certain spectral conditions on the underlying kernels, and a hypercontractivity assumption on the associated eigenfunctions. These conditions are verified by classical high-dimensional examples. Under these conditions, we prove a sharp characterization of the error of random features ridge regression. In particular, we address two fundamental questions: $(1)$~What is the generalization error of KRR? $(2)$~How big $N$ should be for the random features approximation to achieve the same error as KRR? In this setting, we prove that KRR is well approximated by a projection onto the top $ell$ eigenfunctions of the kernel, where $ell$ depends on the sample size $n$. We show that the test error of random features ridge regression is dominated by its approximation error and is larger than the error of KRR as long as $Nle n^{1-delta}$ for some $delta>0$. We characterize this gap. For $Nge n^{1+delta}$, random features achieve the same error as the corresponding KRR, and further increasing $N$ does not lead to a significant change in test error.