New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Sequential Gibbs Sampling Algorithm for Cognitive Diagnosis Models with Many Attributes

70 0 0.0 ( 0 )

Download Cite

Added by Juntao Wang

Publication date 2020

fields Mathematical Statistics

and research's language is English

Authors Juntao Wang - Ningzhong Shi - Xue Zhang

Methodology Computation

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Cognitive diagnosis models (CDMs) are useful statistical tools to provide rich information relevant for intervention and learning. As a popular approach to estimate and make inference of CDMs, the Markov chain Monte Carlo (MCMC) algorithm is widely used in practice. However, when the number of attributes, $K$, is large, the existing MCMC algorithm may become time-consuming, due to the fact that $O(2^K)$ calculations are usually needed in the process of MCMC sampling to get the conditional distribution for each attribute profile. To overcome this computational issue, motivated by Culpepper and Hudson (2018), we propose a computationally efficient sequential Gibbs sampling method, which needs $O(K)$ calculations to sample each attribute profile. We use simulation and real data examples to show the good finite-sample performance of the proposed sequential Gibbs sampling, and its advantage over existing methods.

rate research

Hypothesis Testing for Hierarchical Structures in Cognitive Diagnosis Models

90 - Chenchen Ma , Gongjun Xu 2021

Cognitive Diagnosis Models (CDMs) are a special family of discrete latent variable models widely used in educational, psychological and social sciences. In many applications of CDMs, certain hierarchical structures among the latent attributes are assumed by researchers to characterize their dependence structure. Specifically, a directed acyclic graph is used to specify hierarchical constraints on the allowable configurations of the discrete latent attributes. In this paper, we consider the important yet unaddressed problem of testing the existence of latent hierarchical structures in CDMs. We first introduce the concept of testability of hierarchical structures in CDMs and present sufficient conditions. Then we study the asymptotic behaviors of the likelihood ratio test (LRT) statistic, which is widely used for testing nested models. Due to the irregularity of the problem, the asymptotic distribution of LRT becomes nonstandard and tends to provide unsatisfactory finite sample performance under practical conditions. We provide statistical insights on such failures, and propose to use parametric bootstrap to perform the testing. We also demonstrate the effectiveness and superiority of parametric bootstrap for testing the latent hierarchies over non-parametric bootstrap and the naive Chi-squared test through comprehensive simulations and an educational assessment dataset.

Methodology

Optimal Sampling for Generalized Linear Models under Measurement Constraints

136 - Tao Zhang , Yang Ning , David Ruppert 2019

Under measurement constraints, responses are expensive to measure and initially unavailable on most of records in the dataset, but the covariates are available for the entire dataset. Our goal is to sample a relatively small portion of the dataset where the expensive responses will be measured and the resultant sampling estimator is statistically efficient. Measurement constraints require the sampling probabilities can only depend on a very small set of the responses. A sampling procedure that uses responses at most only on a small pilot sample will be called response-free. We propose a response-free sampling procedure mbox{(OSUMC)} for generalized linear models (GLMs). Using the A-optimality criterion, i.e., the trace of the asymptotic variance, the resultant estimator is statistically efficient within a class of sampling estimators. We establish the unconditional asymptotic distribution of a general class of response-free sampling estimators. This result is novel compared with the existing conditional results obtained by conditioning on both covariates and responses. Under our unconditional framework, the subsamples are no longer independent and new martingale techniques are developed for our asymptotic theory. We further derive the A-optimal response-free sampling distribution. Since this distribution depends on population level quantities, we propose the Optimal Sampling Under Measurement Constraints (OSUMC) algorithm to approximate the theoretical optimal sampling. Finally, we conduct an intensive empirical study to demonstrate the advantages of OSUMC algorithm over existing methods in both statistical and computational perspectives.

Methodology Computation

Learning Latent and Hierarchical Structures in Cognitive Diagnosis Models

110 - Chenchen Ma , Gongjun Xu 2021

Cognitive Diagnosis Models (CDMs) are a special family of discrete latent variable models that are widely used in modern educational, psychological, social and biological sciences. A key component of CDMs is a binary $Q$-matrix characterizing the dependence structure between the items and the latent attributes. Additionally, researchers also assume in many applications certain hierarchical structures among the latent attributes to characterize their dependence. In most CDM applications, the attribute-attribute hierarchical structures, the item-attribute $Q$-matrix, the item-level diagnostic model, as well as the number of latent attributes, need to be fully or partially pre-specified, which however may be subjective and misspecified as noted by many recent studies. This paper considers the problem of jointly learning these latent and hierarchical structures in CDMs from observed data with minimal model assumptions. Specifically, a penalized likelihood approach is proposed to select the number of attributes and estimate the latent and hierarchical structures simultaneously. An efficient expectation-maximization (EM) algorithm and a latent structure recovery algorithm are developed, and statistical consistency theory is also established under mild conditions. The good performance of the proposed method is illustrated by simulation studies and a real data application in educational assessment.

Methodology Machine Learning

Sequential Ensemble Transform for Bayesian Inverse Problems

201 - Aaron Myers , Alexandre H. Thiery , Kainan Wang 2019

We present the Sequential Ensemble Transform (SET) method, an approach for generating approximate samples from a Bayesian posterior distribution. The method explores the posterior distribution by solving a sequence of discrete optimal transport problems to produce a series of transport plans which map prior samples to posterior samples. We prove that the sequence of Dirac mixture distributions produced by the SET method converges weakly to the true posterior as the sample size approaches infinity. Furthermore, our numerical results indicate that, when compared to standard Sequential Monte Carlo (SMC) methods, the SET approach is more robust to the choice of Markov mutation kernels and requires less computational efforts to reach a similar accuracy when used to explore complex posterior distributions. Finally, we describe adaptive schemes that allow to completely automate the use of the SET method.

Methodology Computation

Optimal transport couplings of Gibbs samplers on partitions for unbiased estimation

151 - Brian L. Trippe , Tin D. Nguyen , Tamara Broderick 2021

Computational couplings of Markov chains provide a practical route to unbiased Monte Carlo estimation that can utilize parallel computation. However, these approaches depend crucially on chains meeting after a small number of transitions. For models that assign data into groups, e.g. mixture models, the obvious approaches to couple Gibbs samplers fail to meet quickly. This failure owes to the so-called label-switching problem; semantically equivalent relabelings of the groups contribute well-separated posterior modes that impede fast mixing and cause large meeting times. We here demonstrate how to avoid label switching by considering chains as exploring the space of partitions rather than labelings. Using a metric on this space, we employ an optimal transport coupling of the Gibbs conditionals. This coupling outperforms alternative couplings that rely on labelings and, on a real dataset, provides estimates more precise than usual ergodic averages in the limited time regime. Code is available at github.com/tinnguyen96/coupling-Gibbs-partition.

Methodology Computation

comments

Fetching comments

University of Babylon

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Sequential Gibbs Sampling Algorithm for Cognitive Diagnosis Models with Many Attributes

Ask ChatGPT about the research

No Arabic abstract

Read More