Statistical analysis of massive datasets often requires expensive linear algebra operations with large dense matrices. Typical tasks are the estimation of unknown parameters of the underlying statistical model and the prediction of missing values. We developed the H-MLE procedure, which solves both tasks. The unknown parameters can be estimated by maximizing the joint Gaussian log-likelihood function, which depends on a covariance matrix. To reduce the high computational cost, we approximate the covariance matrix in the hierarchical (H-) matrix format. The H-matrix technique allows us to work with inhomogeneous covariance matrices and almost arbitrary locations; in particular, H-matrices can be applied when the matrices under consideration are dense and unstructured. For validation purposes, we implemented three machine learning methods: k-nearest neighbors (kNN), random forest, and a deep neural network. The best results (for the given datasets) were obtained by the kNN method with three or seven neighbors, depending on the dataset. The results computed with the H-MLE procedure were compared with those obtained by the kNN method. The developed H-matrix code and all datasets are freely available online.
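As a point of reference for the log-likelihood this abstract refers to, below is a minimal dense-matrix sketch in Python. The exponential covariance kernel and the parameter names (sigma2, length_scale, nugget) are illustrative assumptions; the H-MLE procedure replaces the O(n^3) dense Cholesky factorization shown here with an H-matrix approximation.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def gaussian_log_likelihood(z, locs, sigma2, length_scale, nugget=1e-8):
    """Joint Gaussian log-likelihood of observations z at spatial locations locs.

    Dense reference version: the H-MLE procedure would replace the O(n^3)
    Cholesky factorization below with an H-matrix approximation. The
    exponential covariance kernel is an illustrative choice.
    """
    n = len(z)
    d = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
    C = sigma2 * np.exp(-d / length_scale) + nugget * np.eye(n)
    L, lower = cho_factor(C, lower=True)
    alpha = cho_solve((L, lower), z)            # C^{-1} z
    log_det = 2.0 * np.sum(np.log(np.diag(L)))  # log |C|
    return -0.5 * (n * np.log(2.0 * np.pi) + log_det + z @ alpha)

# Maximizing this function over (sigma2, length_scale) yields the MLE,
# e.g. by passing the negative log-likelihood to scipy.optimize.minimize.
```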
We introduce an ensemble Markov chain Monte Carlo approach to sampling from a probability density with known likelihood. This method upgrades an underlying Markov chain by allowing an ensemble of such chains to interact via a process in which one cha
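Since this abstract is truncated before the interaction mechanism is described, the sketch below shows a standard interacting-ensemble scheme, the Goodman-Weare stretch move (as popularized by emcee), purely for illustration; it is not claimed to be the update rule proposed in the paper.

```python
import numpy as np

def stretch_move_sampler(log_prob, x0, n_steps, a=2.0, rng=None):
    """Ensemble MCMC with the Goodman-Weare 'stretch move'.

    Standard interacting-ensemble scheme shown for illustration only; it is
    not necessarily the interaction mechanism introduced in the paper.
    x0: (n_walkers, dim) initial ensemble; log_prob: target log-density.
    """
    rng = np.random.default_rng() if rng is None else rng
    ensemble = np.array(x0, dtype=float)
    n_walkers, dim = ensemble.shape
    logp = np.array([log_prob(x) for x in ensemble])
    chain = np.empty((n_steps, n_walkers, dim))
    for t in range(n_steps):
        for k in range(n_walkers):
            j = rng.choice([i for i in range(n_walkers) if i != k])
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a  # z ~ g(z) on [1/a, a]
            proposal = ensemble[j] + z * (ensemble[k] - ensemble[j])
            logp_prop = log_prob(proposal)
            log_accept = (dim - 1) * np.log(z) + logp_prop - logp[k]
            if np.log(rng.random()) < log_accept:
                ensemble[k], logp[k] = proposal, logp_prop
        chain[t] = ensemble
    return chain

# Usage: sample a 2-D standard normal with 10 interacting walkers.
chain = stretch_move_sampler(lambda x: -0.5 * x @ x,
                             np.random.randn(10, 2), n_steps=2000)
```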
In this work we introduce a reduced-rank algorithm for Gaussian process regression. Our numerical scheme converts a Gaussian process on a user-specified interval to its Karhunen-Loève expansion, the $L^2$-optimal reduced-rank representation. Numeric
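For orientation, a naive grid-based truncation of the Karhunen-Loève expansion can be written in a few lines; the kernel, grid size, and rank below are illustrative assumptions, and the paper's quadrature-based scheme is more careful than this plain eigendecomposition.

```python
import numpy as np

def karhunen_loeve(kernel, a, b, n_grid=400, rank=20):
    """Naive grid-based truncated Karhunen-Loeve expansion of a GP on [a, b].

    Discretizes the covariance operator with uniform quadrature weights and
    keeps the `rank` leading eigenpairs; illustrative sketch only.
    """
    x = np.linspace(a, b, n_grid)
    w = (b - a) / n_grid                       # uniform quadrature weight
    K = kernel(x[:, None], x[None, :])         # covariance matrix on the grid
    evals, evecs = np.linalg.eigh(w * K)       # discretized covariance operator
    idx = np.argsort(evals)[::-1][:rank]
    lam = np.maximum(evals[idx], 0.0)          # clip tiny negative eigenvalues
    phi = evecs[:, idx] / np.sqrt(w)           # approx. L2-normalized eigenfunctions
    return x, lam, phi

# Example: squared-exponential kernel; a sample path is sum_k sqrt(lam_k) xi_k phi_k.
se = lambda s, t: np.exp(-0.5 * (s - t) ** 2 / 0.1 ** 2)
x, lam, phi = karhunen_loeve(se, 0.0, 1.0)
xi = np.random.standard_normal(len(lam))
sample = phi @ (np.sqrt(lam) * xi)
```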
The stochastic partial differential equation approach to Gaussian processes (GPs) represents Matérn GP priors in terms of $n$ finite element basis functions and Gaussian coefficients with sparse precision matrix. Such representations enhance the scal
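To make the sparse-precision structure concrete, here is a small 1D sketch under assumed piecewise-linear finite elements with a lumped mass matrix; the grid, the value of kappa, and the lowest-order operator are illustrative assumptions, not the paper's discretization.

```python
import numpy as np
import scipy.sparse as sp

def spde_precision_1d(n, h, kappa):
    """Sparse precision matrix of a Matern-type GP on a 1D grid via the SPDE view.

    Piecewise-linear finite elements with lumped mass matrix M and stiffness
    matrix G give Q = kappa^2 * M + G for the lowest smoothness case (up to a
    marginal-variance scaling); illustrative sketch, not the paper's construction.
    """
    m = np.full(n, h); m[[0, -1]] = h / 2.0        # lumped mass matrix diagonal
    M = sp.diags(m)
    main = np.full(n, 2.0 / h); main[[0, -1]] = 1.0 / h
    off = np.full(n - 1, -1.0 / h)
    G = sp.diags([off, main, off], [-1, 0, 1])     # stiffness matrix
    return (kappa ** 2 * M + G).tocsc()

# Exact draw from N(0, Q^{-1}): if Q = L L^T, then x = L^{-T} z has covariance Q^{-1}.
# At scale a sparse Cholesky (e.g. CHOLMOD) keeps this cheap; dense here for brevity.
n, h, kappa = 400, 1.0 / 400, 30.0
Q = spde_precision_1d(n, h, kappa)
L = np.linalg.cholesky(Q.toarray())
x = np.linalg.solve(L.T, np.random.standard_normal(n))
```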
We describe a numerical scheme for evaluating the posterior moments of Bayesian linear regression models with partial pooling of the coefficients. The principal analytical tool of the evaluation is a change of basis from coefficient space to the spac
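For context, the posterior moments that such schemes target have a simple closed form in the most basic setting of an independent Gaussian prior on the coefficients; the sketch below uses that setting with illustrative noise and prior scales, and does not reproduce the partial-pooling or change-of-basis structure discussed in the abstract.

```python
import numpy as np

def posterior_moments(X, y, sigma2=1.0, tau2=1.0):
    """Posterior mean and covariance of beta for y = X beta + noise.

    Assumes Gaussian noise with variance sigma2 and an independent N(0, tau2 I)
    prior on the coefficients; the partial-pooling / change-of-basis structure
    of the paper is not reproduced here.
    """
    p = X.shape[1]
    precision = X.T @ X / sigma2 + np.eye(p) / tau2   # posterior precision
    cov = np.linalg.inv(precision)                    # posterior covariance
    mean = cov @ (X.T @ y) / sigma2                   # posterior mean
    return mean, cov

# Example with simulated data.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
beta_true = rng.standard_normal(5)
y = X @ beta_true + 0.5 * rng.standard_normal(200)
mean, cov = posterior_moments(X, y, sigma2=0.25, tau2=1.0)
```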
Markov chain Monte Carlo methods are becoming increasingly popular in applied mathematics as a tool for numerical integration with respect to complex and high-dimensional distributions. However, application of MCMC methods to heavy-tailed distributions and