Exponential tail bounds for sums play an important role in statistics, but the example of the $t$-statistic shows that the exponential tail decay may be lost when population parameters need to be estimated from the data. However, it turns out that if Studentizing is accompanied by estimating the location parameter in a suitable way, then the $t$-statistic regains the exponential tail behavior. Motivated by this example, the paper analyzes other ways of empirically standardizing sums and establishes tail bounds that are sub-Gaussian or even closer to normal for the following settings: standardization with Studentized contrasts for normal observations, standardization with the log-likelihood ratio statistic for observations from an exponential family, and standardization via self-normalization for observations from a symmetric distribution with unknown center of symmetry. The latter standardization gives rise to a novel scan statistic for heteroscedastic data whose asymptotic power is analyzed.
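As a minimal numerical illustration of the phenomenon described above (not taken from the paper): for a fixed sample size, the Studentized mean of normal observations follows a $t$ distribution, whose tails decay only polynomially, whereas the standardized mean with known variance is exactly normal and has sub-Gaussian tails. The sketch below, which assumes SciPy is available, compares the two tail probabilities.

```python
# Illustrative sketch (not from the paper): compare the exponential (Gaussian)
# tail of the standardized mean with known variance to the polynomial tail of
# the Studentized mean (t distribution with n - 1 degrees of freedom).
import numpy as np
from scipy.stats import norm, t

n = 10                                   # sample size
thresholds = np.array([2.0, 4.0, 6.0, 8.0])

for x in thresholds:
    p_normal = norm.sf(x)                # exponential tail decay
    p_student = t.sf(x, df=n - 1)        # polynomial tail decay
    print(f"x = {x:>4.1f}   P(N > x) = {p_normal:.2e}   P(t_{n-1} > x) = {p_student:.2e}")
```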
We develop new approaches in multi-class settings for constructing proper scoring rules and hinge-like losses and establishing corresponding regret bounds with respect to the zero-one or cost-weighted classification loss. Our construction of losses involves deriving new inverse mappings from a concave generalized entropy to a loss through the use of a convex dissimilarity function related to the multi-distribution $f$-divergence. Moreover, we identify new classes of multi-class proper scoring rules, which also recover and reveal interesting relationships between various composite losses currently in use. We establish new classification regret bounds in general for multi-class proper scoring rules by exploiting the Bregman divergences of the associated generalized entropies, and, as applications, provide simple meaningful regret bounds for two specific classes of proper scoring rules. Finally, we derive new hinge-like convex losses, which are tighter convex extensions than related hinge-like losses and geometrically simpler with fewer non-differentiable edges, while achieving similar regret bounds. We also establish a general classification regret bound for all losses which induce the same generalized entropy as the zero-one loss.
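As a hedged, self-contained illustration of the Bregman-divergence connection used above (a standard fact, not the paper's new construction): for a proper scoring rule, the excess expected loss of predicting $p$ when the true class probabilities are $q$ equals the Bregman divergence of the associated generalized entropy, e.g. the KL divergence for the log loss (Shannon entropy) and the squared Euclidean distance for the Brier score (quadratic entropy). A quick numerical check:

```python
# Sketch: excess expected loss of a proper scoring rule equals the Bregman
# divergence of its generalized entropy, checked for log loss and Brier score.
import numpy as np

def expected_log_loss(q, p):
    return -np.sum(q * np.log(p))

def expected_brier(q, p):
    # Brier score for each possible true label (rows of the identity), averaged under q
    return np.sum(q * np.sum((np.eye(len(q)) - p) ** 2, axis=1))

q = np.array([0.5, 0.3, 0.2])   # true class probabilities
p = np.array([0.4, 0.4, 0.2])   # predicted probabilities

kl = np.sum(q * np.log(q / p))
excess_log = expected_log_loss(q, p) - expected_log_loss(q, q)
print(np.isclose(excess_log, kl))                       # True: KL divergence

excess_brier = expected_brier(q, p) - expected_brier(q, q)
print(np.isclose(excess_brier, np.sum((q - p) ** 2)))   # True: squared distance
```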
This paper deals with dimension reduction for high-dimensional time series based on common factors. In particular, we allow the dimension of the time series $p$ to be as large as, or even larger than, the sample size $n$. The estimation of the factor loading matrix and the factor process itself is carried out via an eigenanalysis of a $p\times p$ non-negative definite matrix. We show that when all the factors are strong in the sense that the norm of each column in the factor loading matrix is of the order $p^{1/2}$, the estimator of the factor loading matrix, as well as the resulting estimator of the precision matrix of the original $p$-variate time series, are weakly consistent in $L_2$-norm with convergence rates independent of $p$. This result exhibits clearly that the `curse' is canceled out by the `blessings' in dimensionality. We also establish the asymptotic properties of the estimation when not all factors are strong. For the latter case, a two-step estimation procedure is preferred, in accordance with the asymptotic theory. The proposed methods, together with their asymptotic properties, are further illustrated in a simulation study. An application to a real data set is also reported.
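A minimal sketch of the kind of eigenanalysis described above, assuming the $p\times p$ non-negative definite matrix is accumulated from lagged sample autocovariances of the observed series (a common construction in this line of work; the paper's exact definition may differ):

```python
# Sketch under the stated assumption: build M = sum_k Sigma_hat(k) Sigma_hat(k)',
# which is non-negative definite, and take its leading r eigenvectors as the
# estimated factor loading matrix.
import numpy as np

def estimate_loadings(Y, r, k0=2):
    """Y: (n, p) observed series; r: number of factors; k0: maximum lag used."""
    n, p = Y.shape
    Yc = Y - Y.mean(axis=0)
    M = np.zeros((p, p))
    for k in range(1, k0 + 1):
        Sigma_k = Yc[k:].T @ Yc[:-k] / n       # lag-k sample autocovariance
        M += Sigma_k @ Sigma_k.T               # non-negative definite accumulation
    eigval, eigvec = np.linalg.eigh(M)         # ascending eigenvalues
    A_hat = eigvec[:, ::-1][:, :r]             # leading r eigenvectors as loadings
    F_hat = Yc @ A_hat                         # estimated factor process
    return A_hat, F_hat
```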
The gamma model is a generalized linear model for gamma-distributed outcomes. The model is widely applied in psychology, ecology, and medicine. In this paper we focus on gamma models having a linear predictor without intercept. For a specific scenario, sets of locally D- and A-optimal designs are developed. Recently, Gaffke et al. (2018) established a complete class and an essentially complete class of designs for gamma models to obtain locally D-optimal designs. However, extending this approach to gamma models without an intercept term is complicated, so further techniques are developed in the current work. Moreover, by a suitable transformation between gamma models with and without intercept, optimality results may be transferred from one model to the other. Additionally, by means of the General Equivalence Theorem, optimality can be characterized for multiple regression by a system of polynomial inequalities, which can be solved analytically or by computer algebra. In this way, necessary and sufficient conditions on the parameter values can be obtained for the local D-optimality of particular designs. The robustness of the derived designs with respect to misspecification of the initial parameter values is examined by means of their local D-efficiencies.
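As a hedged sketch of how the General Equivalence Theorem check can be carried out numerically for local D-optimality (the intensity $\lambda(x) = (f(x)^{\top}\beta)^{-2}$ used below corresponds to standard link choices for gamma models and is an assumption; the paper's exact setup, design region, and candidate designs may differ):

```python
# Sketch: for a candidate design, local D-optimality requires the sensitivity
# function lambda(x) * f(x)' M(xi)^{-1} f(x) to be at most the number of
# parameters over the whole design region. Intensity and region are assumptions.
import numpy as np

def info_matrix(design_points, weights, beta):
    M = np.zeros((len(beta), len(beta)))
    for x, w in zip(design_points, weights):
        lam = (x @ beta) ** (-2)               # assumed gamma-model intensity
        M += w * lam * np.outer(x, x)
    return M

def sensitivity(x, M_inv, beta):
    return (x @ beta) ** (-2) * x @ M_inv @ x

beta = np.array([1.0, 2.0])                    # locally assumed parameter values
design = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]   # candidate two-point design
weights = [0.5, 0.5]

M_inv = np.linalg.inv(info_matrix(design, weights, beta))
# D-optimality on the (assumed) design region requires sensitivity <= 2 everywhere
grid = [np.array([a, 1.0 - a]) for a in np.linspace(0.01, 0.99, 99)]
print(max(sensitivity(x, M_inv, beta) for x in grid))
```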
We derive the optimal proposal density for Approximate Bayesian Computation (ABC) using Sequential Monte Carlo (SMC) (or Population Monte Carlo, PMC). The criterion for optimality is that the SMC/PMC-ABC sampler maximise the effective number of samples per parameter proposal. The optimal proposal density represents the optimal trade-off between favoring high acceptance rate and reducing the variance of the importance weights of accepted samples. We discuss two convenient approximations of this proposal and show that the optimal proposal density gives a significant boost in the expected sampling efficiency compared to standard kernels that are in common use in the ABC literature, especially as the number of parameters increases.
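A minimal sketch of the efficiency criterion referred to above, assuming generic ABC-SMC machinery (not the paper's derivation): the effective sample size per parameter proposal, computed from the importance weights of the accepted particles, captures the trade-off between acceptance rate and weight variance.

```python
# Sketch: ESS of accepted particles divided by the number of proposals made,
# i.e. the effective number of samples per parameter proposal.
import numpy as np

def effective_sample_size(weights):
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

# toy importance weights of accepted particles from one SMC/PMC-ABC iteration
weights = np.array([0.2, 1.3, 0.7, 0.05, 2.1])
n_proposals = 40                                      # total proposals this iteration
print(effective_sample_size(weights) / n_proposals)   # efficiency criterion
```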
Matching methods are widely used for causal inference in observational studies. Among them, nearest neighbor matching is arguably the most popular. However, nearest neighbor matching does not generally yield an average treatment effect estimator that is $\sqrt{n}$-consistent (Abadie and Imbens, 2006). Are matching methods not $\sqrt{n}$-consistent in general? In this paper, we study a recent class of matching methods that use integer programming to directly target aggregate covariate balance, as opposed to finding close neighbor matches. We show that under suitable conditions these methods can yield simple estimators that are $\sqrt{n}$-consistent and asymptotically optimal.
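A hedged sketch of balance-constrained matching via integer programming, in the spirit of cardinality matching (the toy data, tolerance, and formulation below are assumptions; the paper's estimators and conditions are more general). It selects as many control units as possible subject to the selected controls' covariate means lying within a tolerance of the treated means; requires SciPy >= 1.9 for `milp`.

```python
# Sketch: maximize the number of selected controls subject to linear mean-balance
# constraints, solved as a mixed-integer program.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

rng = np.random.default_rng(0)
X_treated = rng.normal(0.3, 1.0, size=(50, 2))     # treated covariates (toy data)
X_control = rng.normal(0.0, 1.0, size=(200, 2))    # control pool (toy data)
delta = 0.05                                       # balance tolerance per covariate

target = X_treated.mean(axis=0)
n = X_control.shape[0]
# Balance constraints, linear in the 0/1 selection indicators x:
#   sum_j x_j (X_jk - target_k - delta) <= 0  and  sum_j x_j (X_jk - target_k + delta) >= 0
A_upper = (X_control - target - delta).T           # shape (num_covariates, n)
A_lower = (X_control - target + delta).T
constraints = [LinearConstraint(A_upper, -np.inf, 0.0),
               LinearConstraint(A_lower, 0.0, np.inf)]

res = milp(c=-np.ones(n), constraints=constraints,
           integrality=np.ones(n), bounds=Bounds(0, 1))
selected = res.x.round().astype(bool)
print(selected.sum(), X_control[selected].mean(axis=0), target)
```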