Estimation of tail quantities, such as expected shortfall or Value at Risk, is a difficult problem. We show how the theory of nonlinear expectations, in particular the data-robust expectation introduced in [5], can assist in quantifying the statistical uncertainty of these problems. However, in a heavy-tailed context (in particular, when our data are described by a Pareto distribution, as is common in much of extreme value theory), the theory of [5] is insufficient and requires an additional regularization step, which we introduce. By asking whether this regularization is possible, we obtain a qualitative requirement for reliable estimation of tail quantities and risk measures in a Pareto setting.
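As a point of reference for the kind of plug-in tail estimates such a regularization must stabilize, the following minimal sketch computes classical Hill-based estimates of Value at Risk and expected shortfall for Pareto-type data. It illustrates only the standard (un-regularized) approach, not the nonlinear-expectation machinery of [5]; the function names and the choices of `k` and `p` are illustrative.

```python
import numpy as np

def hill_tail_index(x, k):
    """Hill estimator of the Pareto tail index alpha from the k largest order statistics."""
    xs = np.sort(x)
    top = xs[-k:]            # k largest observations
    threshold = xs[-k - 1]   # (k+1)-th largest observation as the threshold
    return k / np.sum(np.log(top / threshold))

def pareto_var_es(x, k, p):
    """Plug-in VaR and expected shortfall at level p for a Pareto-type tail."""
    n = len(x)
    alpha = hill_tail_index(x, k)
    u = np.sort(x)[-k - 1]
    var_p = u * (k / (n * (1 - p))) ** (1 / alpha)  # Weissman-type quantile estimate
    es_p = var_p * alpha / (alpha - 1)              # finite only when alpha > 1
    return var_p, es_p

rng = np.random.default_rng(0)
data = rng.pareto(2.5, size=5000) + 1.0  # classical Pareto(alpha=2.5) on [1, inf)
print(pareto_var_es(data, k=200, p=0.99))
```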
In stochastic decision problems, one often wants to estimate the underlying probability measure statistically, and then to use this estimate as a basis for decisions. We shall consider how the uncertainty in this estimation can be explicitly and consistently incorporated in the valuation of decisions, using the theory of nonlinear expectations.
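One simple way to picture this programme is to replace a point estimate by a worst-case (sublinear) expectation over a confidence set of models. The sketch below does this for a Bernoulli parameter; it is a toy illustration of the general idea only, not the specific nonlinear expectation constructed here, and the name `robust_expectation` is hypothetical.

```python
import numpy as np
from scipy import stats

def robust_expectation(loss, data, level=0.95):
    """Upper (worst-case) expectation of a loss depending on a Bernoulli parameter:
    maximise the expected loss over a confidence set for the parameter."""
    n, s = len(data), int(np.sum(data))
    # Clopper-Pearson confidence interval for the success probability
    lo = stats.beta.ppf((1 - level) / 2, s, n - s + 1) if s > 0 else 0.0
    hi = stats.beta.ppf(1 - (1 - level) / 2, s + 1, n - s) if s < n else 1.0
    thetas = np.linspace(lo, hi, 200)
    return max(loss(t) for t in thetas)

# Toy decision: a bet paying -1 with probability theta and +1 otherwise,
# so the expected loss as a function of theta is 2*theta - 1.
data = np.random.default_rng(1).binomial(1, 0.3, size=100)
print(robust_expectation(lambda t: 2 * t - 1, data))
```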
In this work we construct an optimal shrinkage estimator for the precision matrix in high dimensions. We consider the general asymptotics in which the number of variables $p\rightarrow\infty$ and the sample size $n\rightarrow\infty$ so that $p/n\rightarrow c\in(0,+\infty)$. The precision matrix is estimated directly, without inverting the corresponding estimator for the covariance matrix. Recent results from random matrix theory allow us to find the asymptotic deterministic equivalents of the optimal shrinkage intensities and to estimate them consistently. The resulting distribution-free estimator almost surely attains the minimum Frobenius loss. Additionally, we prove that the Frobenius norms of the inverse and of the pseudo-inverse sample covariance matrices tend almost surely to deterministic quantities, and we estimate them consistently. Finally, a simulation study compares the suggested estimator with precision matrix estimators proposed in the literature. The optimal shrinkage estimator shows significant improvement and robustness even for non-normally distributed data.
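To make the target of the optimization concrete, the sketch below computes the oracle linear shrinkage of the inverse sample covariance towards the identity, i.e. the Frobenius-optimal intensities when the true precision matrix is known. The paper's contribution is precisely to estimate these intensities consistently without that knowledge; the oracle version here is only an illustrative benchmark, and the function name is hypothetical.

```python
import numpy as np

def oracle_shrinkage_precision(S_inv, Pi):
    """Oracle shrinkage of the (pseudo-)inverse sample covariance towards I:
    find alpha, beta minimising ||alpha * S_inv + beta * I - Pi||_F."""
    p = S_inv.shape[0]
    # Normal equations of the least-squares problem in (alpha, beta)
    G = np.array([[np.sum(S_inv * S_inv), np.trace(S_inv)],
                  [np.trace(S_inv),       float(p)       ]])
    b = np.array([np.sum(S_inv * Pi), np.trace(Pi)])
    alpha, beta = np.linalg.solve(G, b)
    return alpha * S_inv + beta * np.eye(p)

rng = np.random.default_rng(2)
p, n = 100, 200
Sigma = np.diag(np.linspace(0.5, 2.0, p))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = X.T @ X / n  # use np.linalg.pinv(S) below when p > n
est = oracle_shrinkage_precision(np.linalg.inv(S), np.linalg.inv(Sigma))
```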
We study capital process behavior in the fair-coin game and biased-coin games in the framework of the game-theoretic probability of Shafer and Vovk (2001). We show that if Skeptic uses a Bayesian strategy with a beta prior, the capital process is lucidly expressed in terms of the past average of Reality's moves. From this it is proved that Skeptic's Bayesian strategy weakly forces the strong law of large numbers (SLLN) with convergence rate $O(\sqrt{\log n/n})$, and that if Reality violates SLLN then the exponential growth rate of the capital process is very accurately described in terms of the Kullback-Leibler divergence between the average of Reality's moves when she violates SLLN and the average when she observes SLLN. We also investigate optimality properties associated with the Bayesian strategy.
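A minimal simulation of this strategy is easy to write down: with a Beta(a, b) prior, Skeptic's capital in the fair-coin game is the likelihood ratio of the beta-mixture predictive to the fair coin, which depends on Reality's moves only through their running average. The sketch below assumes moves coded as 0/1 with each outcome priced at 1/2; the function name is illustrative.

```python
import numpy as np
from scipy.special import betaln

def bayesian_capital(x, a=0.5, b=0.5):
    """Capital process of Skeptic's Bayesian strategy with a Beta(a, b) prior
    in the fair-coin game: K_n = 2^n * B(a + h_n, b + n - h_n) / B(a, b)."""
    x = np.asarray(x)
    h = np.cumsum(x)                  # running count of heads
    n = np.arange(1, len(x) + 1)
    log_k = n * np.log(2) + betaln(a + h, b + (n - h)) - betaln(a, b)
    return np.exp(log_k)

# Reality violating SLLN (biased coin): the capital grows roughly like
# exp(n * KL(empirical mean || 1/2)), matching the stated growth rate.
rng = np.random.default_rng(3)
x = rng.binomial(1, 0.7, size=1000)
print(bayesian_capital(x)[-1])
```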
We apply Gaussian process (GP) regression, which provides a powerful non-parametric probabilistic method of relating inputs to outputs, to survival data consisting of time-to-event and covariate measurements. In this context, the covariates are regarded as the 'inputs' and the event times as the 'outputs'. This allows for highly flexible inference of non-linear relationships between covariates and event times. Many existing methods, such as the ubiquitous Cox proportional hazards model, focus primarily on the hazard rate, which is typically assumed to take some parametric or semi-parametric form. Our proposed model belongs to the class of accelerated failure time models, where we focus on directly characterising the relationship between covariates and event times without any explicit assumptions on what form the hazard rates take. It is straightforward to include various types and combinations of censored and truncated observations. We apply our approach to both simulated and experimental data. We then apply multiple-output GP regression, which can handle multiple potentially correlated outputs for each input, to competing-risks survival data where multiple event types can occur. By tuning one of the model parameters we can control the extent to which the multiple outputs (the time-to-event for each risk) are dependent, thus allowing the specification of correlated risks. Simulation studies suggest that in some cases assuming dependence can lead to more accurate predictions.
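The accelerated-failure-time view can be sketched very simply in the uncensored case: regress log event times on covariates with a GP and read off a predictive distribution. The toy example below uses scikit-learn for this; it omits the censoring, truncation, and multiple-output machinery that the proposed model handles, and the simulated data are purely illustrative.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy AFT setup: log event time = f(covariate) + noise, with f a
# nonlinear function to be inferred by the GP (uncensored data only).
rng = np.random.default_rng(4)
X = rng.uniform(0, 3, size=(80, 1))                             # covariates ("inputs")
log_t = np.sin(X[:, 0]) + 1.0 + 0.1 * rng.standard_normal(80)   # log event times ("outputs")

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, log_t)

x_new = np.array([[1.5]])
mean, sd = gp.predict(x_new, return_std=True)  # predictive log event time
print(np.exp(mean), sd)
```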
Performance guarantees for compression in nonlinear models under non-Gaussian observations can be achieved through the use of distributional characteristics that are sensitive to the distance to normality and, in particular, vanish under Gaussian or linear sensing. The use of these characteristics, or discrepancies, improves some previous results in this area by relaxing conditions and tightening performance bounds. In addition, these characteristics are tractable to compute when Gaussian sensing is corrupted by either additive errors or mixing.