Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

merlin - a unified modelling framework for data analysis and methods development in Stata

83 0 0.0 ( 0 )

Download Cite

Added by Michael Crowther

Publication date 2018

fields Mathematical Statistics

and research's language is English

Authors Michael J. Crowther

Computation Methodology

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

merlin can do a lot of things. From simple stuff, like fitting a linear regression or a Weibull survival model, to a three-level logistic mixed effects model, or a multivariate joint model of multiple longitudinal outcomes (of different types) and a recurrent event and survival with non-linear effects...the list is rather endless. merlin can do things I havent even thought of yet. Ill take a single dataset, and attempt to show you the full range of capabilities of merlin, and discuss some future directions for the implementation in Stata.

rate research

A unified performance analysis of likelihood-informed subspace methods

55 - Tiangang Cui , Xin T. Tong 2021

The likelihood-informed subspace (LIS) method offers a viable route to reducing the dimensionality of high-dimensional probability distributions arisen in Bayesian inference. LIS identifies an intrinsic low-dimensional linear subspace where the target distribution differs the most from some tractable reference distribution. Such a subspace can be identified using the leading eigenvectors of a Gram matrix of the gradient of the log-likelihood function. Then, the original high-dimensional target distribution is approximated through various forms of ridge approximations of the likelihood function, in which the approximated likelihood only has support on the intrinsic low-dimensional subspace. This approximation enables the design of inference algorithms that can scale sub-linearly with the apparent dimensionality of the problem. Intuitively, the accuracy of the approximation, and hence the performance of the inference algorithms, are influenced by three factors -- the dimension truncation error in identifying the subspace, Monte Carlo error in estimating the Gram matrices, and Monte Carlo error in constructing ridge approximations. This work establishes a unified framework to analysis each of these three factors and their interplay. Under mild technical assumptions, we establish error bounds for a range of existing dimension reduction techniques based on the principle of LIS. Our error bounds also provide useful insights into the accuracy comparison of these methods. In addition, we analyze the integration of LIS with sampling methods such as Markov Chain Monte Carlo (MCMC) and sequential Monte Carlo (SMC). We also demonstrate our analyses on a linear inverse problem with Gaussian prior, which shows that all the estimates can be dimension-independent if the prior covariance is a trace-class operator.

Computation Statistics Theory Statistics Theory

Gaussian Process for Functional Data Analysis: The GPFDA Package for R

179 - Evandro Konzen , Yafeng Cheng , Jian Qing Shi 2021

We present and describe the GPFDA package for R. The package provides flexible functionalities for dealing with Gaussian process regression (GPR) models for functional data. Multivariate functional data, functional data with multidimensional inputs, and nonseparable and/or nonstationary covariance structures can be modeled. In addition, the package fits functional regression models where the mean function depends on scalar and/or functional covariates and the covariance structure is modeled by a GPR model. In this paper, we present the versatility of GPFDA with respect to mean function and covariance function specifications and illustrate the implementation of estimation and prediction of some models through reproducible numerical examples.

Computation Methodology

Manifold Markov chain Monte Carlo methods for Bayesian inference in a wide class of diffusion models

72 - Matthew M. Graham , Alexandre H. Thiery , Alexandros Beskos 2019

Bayesian inference for nonlinear diffusions, observed at discrete times, is a challenging task that has prompted the development of a number of algorithms, mainly within the computational statistics community. We propose a new direction, and accompanying methodology, borrowing ideas from statistical physics and computational chemistry, for inferring the posterior distribution of latent diffusion paths and model parameters, given observations of the process. Joint configurations of the underlying process noise and of parameters, mapping onto diffusion paths consistent with observations, form an implicitly defined manifold. Then, by making use of a constrained Hamiltonian Monte Carlo algorithm on the embedded manifold, we are able to perform computationally efficient inference for an extensive class of discretely observed diffusion models. Critically, in contrast with other approaches proposed in the literature, our methodology is highly automated, requiring minimal user intervention and applying alike in a range of settings, including: elliptic or hypo-elliptic systems; observations with or without noise; linear or non-linear observation operators. Exploiting Markovianity, we propose a variant of the method with complexity that scales linearly in the resolution of path discretisation and the number of observation times.

Computation Methodology

PoolTestR: An R package for estimating prevalence and regression modelling with pooled samples

172 - Angus McLure , Ben ONeill , Helen Mayfield 2020

Pooled testing (also known as group testing), where diagnostic tests are performed on pooled samples, has broad applications in the surveillance of diseases in animals and humans. An increasingly common use case is molecular xenomonitoring (MX), where surveillance of vector-borne diseases is conducted by capturing and testing large numbers of vectors (e.g. mosquitoes). The R package PoolTestR was developed to meet the needs of increasingly large and complex molecular xenomonitoring surveys but can be applied to analyse any data involving pooled testing. PoolTestR includes simple and flexible tools to estimate prevalence and fit fixed- and mixed-effect generalised linear models for pooled data in frequentist and Bayesian frameworks. Mixed-effect models allow users to account for the hierarchical sampling designs that are often employed in surveys, including MX. We demonstrate the utility of PoolTestR by applying it to a large synthetic dataset that emulates a MX survey with a hierarchical sampling design.

Computation Methodology Other Statistics

Optimal subsampling for quantile regression in big data

120 - HaiYing Wang , Yanyuan Ma 2020

We investigate optimal subsampling for quantile regression. We derive the asymptotic distribution of a general subsampling estimator and then derive tw

Computation Methodology

comments

Fetching comments

Peninsula Private University

Additional details More universities

merlin - a unified modelling framework for data analysis and methods development in Stata

Ask ChatGPT about the research

No Arabic abstract

Read More