Several statistics-based detectors, based on unimodal matrix models, for determining the number of sources in a field are designed. A new variance ratio statistic is proposed, and its asymptotic distribution is analyzed. The variance ratio detector is shown to outperform the alternatives. It is shown that further improvements are achievable via optimally selected rotations. Numerical experiments demonstrate the performance gains of our detection methods over the baseline approach.
Large graphs are natural mathematical models for describing the structure of the data in a wide variety of fields, such as web mining, social networks, information retrieval, biological networks, etc. For all these applications, automatic tools are required to get a synthetic view of the graph and to reach a good understanding of the underlying problem. In particular, discovering groups of tightly connected vertices and understanding the relations between those groups is very important in practice. This paper shows how a kernel version of the batch Self Organizing Map can be used to achieve these goals via kernels derived from the Laplacian matrix of the graph, especially when it is used in conjunction with more classical methods based on the spectral analysis of the graph. The proposed method is used to explore the structure of a medieval social network modeled through a weighted graph that has been directly built from a large corpus of agrarian contracts.
Network tomography has been regarded as one of the most promising methodologies for performance evaluation and diagnosis of the massive and decentralized Internet. This paper proposes a new estimation approach for solving a class of inverse problems in network tomography, based on marginal distributions of a sequence of one-dimensional linear projections of the observed data. We give a general identifiability result for the proposed method and study the design of these one-dimensional projections in terms of statistical efficiency. We show that for a simple Gaussian tomography model, there is an optimal set of one-dimensional projections such that the estimator obtained from these projections is asymptotically as efficient as the maximum likelihood estimator based on the joint distribution of the observed data. For practical applications, we carry out simulation studies of the proposed method for two instances of network tomography. The first is traffic demand tomography using a Gaussian Origin-Destination traffic model with a power relation between its mean and variance, and the second is network delay tomography, where the link delays are to be estimated from the end-to-end path delays. We compare the estimators obtained from our method with those obtained from the joint distribution and from other lower-dimensional projections, and show that in both cases the proposed method yields satisfactory results.
Let $X:=(X_1, \ldots, X_p)$ be random objects (the inputs), defined on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and valued in some measurable space $E=E_1\times\ldots\times E_p$. Further, let $Y := f(X_1, \ldots, X_p)$ be the output. Here, $f$ is a measurable function from $E$ to some Hilbert space $\mathbb{H}$ ($\mathbb{H}$ could be of either finite or infinite dimension). In this work, we give a natural generalization of the Sobol indices (which are classically defined when $Y\in\mathbb{R}$) to the case where the output belongs to $\mathbb{H}$. These indices have useful properties. First, they are invariant under isometry and scaling. Further, they can be easily estimated, as in dimension $1$, by using the so-called Pick and Freeze method. We investigate the asymptotic behaviour of this estimation scheme.
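To make the Pick and Freeze idea concrete, here is a minimal sketch for a finite-dimensional output $Y\in\mathbb{R}^d$. It assumes independent inputs and takes the generalized index to be the ratio of traces $\mathrm{Tr}(\mathrm{Cov}(Y, Y^u))/\mathrm{Tr}(\mathrm{Cov}(Y))$, where $Y^u$ is the output recomputed with the inputs in $u$ kept ("frozen") and the others redrawn. That trace form, the pooled-mean estimator, and the names `pick_freeze_index`, `sample_input`, and `toy_model` are assumptions made for illustration; they are not quoted from the abstract.

```python
import random


def pick_freeze_index(f, sample_input, u, n, seed=0):
    """Pick-and-Freeze estimate of a trace-ratio Sobol index (illustrative).

    f            : maps a list of p inputs to a list of d outputs
    sample_input : draws one vector of p independent inputs from an rng
    u            : set of input positions whose joint influence is measured
    n            : number of Monte Carlo pairs
    """
    rng = random.Random(seed)
    ys, yus = [], []
    for _ in range(n):
        x = sample_input(rng)
        x_prime = sample_input(rng)
        # Freeze the coordinates in u, redraw ("pick") all the others.
        x_u = [x[j] if j in u else x_prime[j] for j in range(len(x))]
        ys.append(f(x))
        yus.append(f(x_u))

    d = len(ys[0])
    num = 0.0  # estimates Tr Cov(Y, Y^u)
    den = 0.0  # estimates Tr Cov(Y)
    for k in range(d):
        a = [y[k] for y in ys]
        b = [y[k] for y in yus]
        # Y and Y^u share the same law, so pool both samples for the moments.
        mean = (sum(a) + sum(b)) / (2 * n)
        num += sum(ai * bi for ai, bi in zip(a, b)) / n - mean ** 2
        den += sum(ai * ai + bi * bi for ai, bi in zip(a, b)) / (2 * n) - mean ** 2
    return num / den


def toy_model(x):
    # Hypothetical two-dimensional output used only for this example.
    return [x[0] + x[1], 2.0 * x[0]]


def gaussian_inputs(rng):
    # Two independent standard normal inputs.
    return [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]


s_first = pick_freeze_index(toy_model, gaussian_inputs, {0}, n=20000)
s_second = pick_freeze_index(toy_model, gaussian_inputs, {1}, n=20000)
```

For this toy model the exact values under the trace-ratio definition are $5/6$ for the first input and $1/6$ for the second, so the two estimates should land close to those numbers and sum to roughly one.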
Lag windows are commonly used in time series, econometrics, steady-state simulation, and Markov chain Monte Carlo to estimate time-average covariance matrices. In the presence of positive correlation of the underlying process, estimators of this matrix almost always exhibit significant negative bias, leading to undesirable finite-sample properties. We propose a new family of lag windows specifically designed to improve finite-sample performance by offsetting this negative bias. Any existing lag window can be adapted into a lugsail equivalent with no additional assumptions. We use these lag windows within spectral variance estimators and demonstrate their advantages in a linear regression model with autocorrelated and heteroskedastic residuals. We further employ the lugsail lag windows in weighted batch means estimators due to their computational efficiency on large simulation output. We obtain bias and variance results for these multivariate estimators and significantly weaken the mixing condition on the process. Superior finite-sample properties are illustrated in a vector autoregressive process and a Bayesian logistic regression model.
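The abstract does not state the lugsail transformation itself. As a hedged illustration, the sketch below assumes the form $k_L(x) = \frac{1}{1-c}\,k(x) - \frac{c}{1-c}\,k(rx)$ with $r \ge 1$ and $0 \le c < 1$, applied to a Bartlett window. The formula, the default values of `r` and `c`, and the function names are assumptions for illustration only.

```python
def bartlett(x):
    """Bartlett (triangular) lag window on [-1, 1]."""
    return max(0.0, 1.0 - abs(x))


def lugsail(window, r=3.0, c=0.5):
    """Wrap an existing lag window into an assumed lugsail form.

    r >= 1 compresses the second copy of the window and 0 <= c < 1 sets
    how much of it is subtracted; c = 0 returns the original window.
    """
    if r < 1.0 or not 0.0 <= c < 1.0:
        raise ValueError("need r >= 1 and 0 <= c < 1")

    def k(x):
        return (window(x) - c * window(r * x)) / (1.0 - c)

    return k


lugsail_bartlett = lugsail(bartlett)
# Window weights at a few scaled lags x = lag / bandwidth.
weights = [lugsail_bartlett(x) for x in (0.0, 0.2, 0.5, 1.0)]
```

Under these assumptions the window still equals one at lag zero but rises above one at small lags before decaying to zero, which is how it would give extra weight to low-lag autocovariances and push against the negative bias described above.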
This paper makes the following original contributions. First, we develop a unifying framework for testing shape restrictions based on the Wald principle. The test has asymptotic uniform size control and is uniformly consistent. Second, we examine the applicability and usefulness of some prominent shape-enforcing operators in implementing our framework. In particular, in stark contrast to its use in point and interval estimation, the rearrangement operator is inapplicable due to a lack of convexity. The greatest convex minorization and the least concave majorization are shown to enjoy the analytic properties required to employ our framework. Third, we show that, although the projection operator may not be well defined or well behaved in general parameter spaces such as those defined by uniform norms, one may nonetheless employ a powerful distance-based test by applying our framework. Monte Carlo simulations confirm that our test works well. We further showcase the empirical relevance of our framework by investigating the relationship between weekly working hours and annual wage growth in the high-end labor market.