No Arabic abstract
We consider Bayesian inference of sparse covariance matrices and propose a post-processed posterior. This method consists of two steps. In the first step, posterior samples are obtained from the conjugate inverse-Wishart posterior without considering the sparse structural assumption. The posterior samples are transformed in the second step to satisfy the sparse structural assumption through the hard-thresholding function. This non-traditional Bayesian procedure is justified by showing that the post-processed posterior attains the optimal minimax rates. We also investigate the application of the post-processed posterior to the estimation of the global minimum variance portfolio. We show that the post-processed posterior for the global minimum variance portfolio also attains the optimal minimax rate under the sparse covariance assumption. The advantages of the post-processed posterior for the global minimum variance portfolio are demonstrated by a simulation study and a real data analysis with S&P 400 data.
We consider Bayesian inference of banded covariance matrices and propose a post-processed posterior. The post-processing of the posterior consists of two steps. In the first step, posterior samples are obtained from the conjugate inverse-Wishart posterior which does not satisfy any structural restrictions. In the second step, the posterior samples are transformed to satisfy the structural restriction through a post-processing function. The conceptually straightforward procedure of the post-processed posterior makes its computation efficient and can render interval estimators of functionals of covariance matrices. We show that it has nearly optimal minimax rates for banded covariances among all possible pairs of priors and post-processing functions. Furthermore, we prove that the expected coverage probability of the $(1-alpha)100%$ highest posterior density region of the post-processed posterior is asymptotically $1-alpha$ with respect to a conventional posterior distribution. It implies that the highest posterior density region of the post-processed posterior is, on average, a credible set of a conventional posterior. The advantages of the post-processed posterior are demonstrated by a simulation study and a real data analysis.
Statistical inference for sparse covariance matrices is crucial to reveal dependence structure of large multivariate data sets, but lacks scalable and theoretically supported Bayesian methods. In this paper, we propose beta-mixture shrinkage prior, computationally more efficient than the spike and slab prior, for sparse covariance matrices and establish its minimax optimality in high-dimensional settings. The proposed prior consists of beta-mixture shrinkage and gamma priors for off-diagonal and diagonal entries, respectively. To ensure positive definiteness of the resulting covariance matrix, we further restrict the support of the prior to a subspace of positive definite matrices. We obtain the posterior convergence rate of the induced posterior under the Frobenius norm and establish a minimax lower bound for sparse covariance matrices. The class of sparse covariance matrices for the minimax lower bound considered in this paper is controlled by the number of nonzero off-diagonal elements and has more intuitive appeal than those appeared in the literature. The obtained posterior convergence rate coincides with the minimax lower bound unless the true covariance matrix is extremely sparse. In the simulation study, we show that the proposed method is computationally more efficient than competitors, while achieving comparable performance. Advantages of the shrinkage prior are demonstrated based on two real data sets.
The only input to attain the portfolio weights of global minimum variance portfolio (GMVP) is the covariance matrix of returns of assets being considered for investment. Since the population covariance matrix is not known, investors use historical data to estimate it. Even though sample covariance matrix is an unbiased estimator of the population covariance matrix, it includes a great amount of estimation error especially when the number of observed data is not much bigger than number of assets. As it is difficult to estimate the covariance matrix with high dimensionality all at once, clustering stocks is proposed to come up with covariance matrix in two steps: firstly, within a cluster and secondly, between clusters. It decreases the estimation error by reducing the number of features in the data matrix. The motivation of this dissertation is that the estimation error can still remain high even after clustering, if a large amount of stocks is clustered together in a single group. This research proposes to utilize a bounded clustering method in order to limit the maximum cluster size. The result of experiments shows that not only the gap between in-sample volatility and out-of-sample volatility decreases, but also the out-of-sample volatility gets reduced. It implies that we need a bounded clustering algorithm so that maximum clustering size can be precisely controlled to find the best portfolio performance.
Our problem is to find a good approximation to the P-value of the maximum of a random field of test statistics for a cone alternative at each point in a sample of Gaussian random fields. These test statistics have been proposed in the neuroscience literature for the analysis of fMRI data allowing for unknown delay in the hemodynamic response. However the null distribution of the maximum of this 3D random field of test statistics, and hence the threshold used to detect brain activation, was unsolved. To find a solution, we approximate the P-value by the expected Euler characteristic (EC) of the excursion set of the test statistic random field. Our main result is the required EC density, derived using the Gaussian Kinematic Formula.
In this study, we construct two tests for the weights of the global minimum variance portfolio (GMVP) in a high-dimensional setting, namely, when the number of assets $p$ depends on the sample size $n$ such that $frac{p}{n}to c in (0,1)$ as $n$ tends to infinity. In the case of a singular covariance matrix with rank equal to $q$ we assume that $q/nto tilde{c}in(0, 1)$ as $ntoinfty$. The considered tests are based on the sample estimator and on the shrinkage estimator of the GMVP weights. We derive the asymptotic distributions of the test statistics under the null and alternative hypotheses. Moreover, we provide a simulation study where the power functions and the receiver operating characteristic curves of the proposed tests are compared with other existing approaches. We observe that the test based on the shrinkage estimator performs well even for values of $c$ close to one.