90 - Chenchen Ma , Gongjun Xu 2021
Cognitive Diagnosis Models (CDMs) are a special family of discrete latent variable models widely used in educational, psychological and social sciences. In many applications of CDMs, certain hierarchical structures among the latent attributes are assumed by researchers to characterize their dependence structure. Specifically, a directed acyclic graph is used to specify hierarchical constraints on the allowable configurations of the discrete latent attributes. In this paper, we consider the important yet unaddressed problem of testing the existence of latent hierarchical structures in CDMs. We first introduce the concept of testability of hierarchical structures in CDMs and present sufficient conditions. Then we study the asymptotic behaviors of the likelihood ratio test (LRT) statistic, which is widely used for testing nested models. Due to the irregularity of the problem, the asymptotic distribution of LRT becomes nonstandard and tends to provide unsatisfactory finite sample performance under practical conditions. We provide statistical insights on such failures, and propose to use parametric bootstrap to perform the testing. We also demonstrate the effectiveness and superiority of parametric bootstrap for testing the latent hierarchies over non-parametric bootstrap and the naive Chi-squared test through comprehensive simulations and an educational assessment dataset.
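As a rough illustration of the idea of calibrating a nonstandard LRT by parametric bootstrap, the sketch below uses a deliberately simple toy problem rather than a CDM: testing mu = 0 against mu >= 0 for N(mu, 1) data, where the boundary constraint also makes the usual Chi-squared reference distribution inappropriate. The function names `lrt_stat` and `parametric_bootstrap_pvalue` and the toy model are assumptions made for this example and are not taken from the paper.

```python
import random

def lrt_stat(xs):
    """LRT statistic for H0: mu = 0 vs H1: mu >= 0 with N(mu, 1) data.

    The constrained MLE is max(mean, 0), so the statistic is n * max(mean, 0)^2.
    """
    n = len(xs)
    xbar = sum(xs) / n
    return n * max(xbar, 0.0) ** 2

def parametric_bootstrap_pvalue(xs, n_boot=2000, seed=0):
    """Approximate the null distribution of the LRT by simulating from the fitted null."""
    rng = random.Random(seed)
    observed = lrt_stat(xs)
    n = len(xs)
    exceed = 0
    for _ in range(n_boot):
        # Toy assumption: the fitted null model is N(0, 1), so there is nothing to estimate.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if lrt_stat(sample) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)
```

In a real CDM application the simulation step would draw responses from the model fitted under the hierarchical (null) structure, and both the null and alternative models would be refitted on every bootstrap sample.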
We consider statistical inference for a noisy, incomplete 1-bit matrix. Instead of observing a subset of real-valued entries of a matrix M, we only have one binary (1-bit) measurement for each entry in this subset, where the binary measurement follows a Bernoulli distribution whose success probability is determined by the value of the entry. Despite the importance of uncertainty quantification to matrix completion, most of the categorical matrix completion literature focuses on point estimation and prediction. This paper moves one step further towards statistical inference for 1-bit matrix completion. Under a popular nonlinear factor analysis model, we obtain a point estimator and derive its asymptotic distribution for any linear form of M and latent factor scores. Moreover, our analysis adopts a flexible missing-entry design that does not require a random sampling scheme as required by most of the existing asymptotic results for matrix completion. The proposed estimator is statistically efficient and optimal, in the sense that the Cramer-Rao lower bound is achieved asymptotically for the model parameters. Two applications are considered, including (1) linking two forms of an educational test and (2) linking the roll call voting records from multiple years in the United States Senate. The first application enables the comparison between examinees who took different test forms, and the second application allows us to compare the liberal-conservativeness of senators who did not serve in the Senate at the same time.
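The observation scheme described above can be sketched in a few lines. This is only an illustration of the data-generating mechanism, not of the paper's estimator: the logistic link is an assumption (the abstract only says the success probability is determined by the entry), and the names `sigmoid` and `simulate_one_bit` are hypothetical. The caller supplies the observation mask directly, which mirrors the point that the missing-entry pattern need not come from random sampling.

```python
import math
import random

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def simulate_one_bit(M, mask, seed=0):
    """Return 1-bit measurements of M; unobserved entries are None.

    M    : list of rows of real values (the underlying matrix)
    mask : list of rows of booleans, True where an entry is observed
    Assumed link: P(Y_ij = 1) = sigmoid(M_ij).
    """
    rng = random.Random(seed)
    Y = []
    for m_row, mask_row in zip(M, mask):
        y_row = []
        for m_ij, observed in zip(m_row, mask_row):
            if observed:
                y_row.append(1 if rng.random() < sigmoid(m_ij) else 0)
            else:
                y_row.append(None)
        Y.append(y_row)
    return Y
```

For the test-linking application, for example, the mask would mark which items each examinee actually took, so whole blocks of the matrix are missing by design.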
Recurrent event analyses have found a wide range of applications in biomedicine, public health, and engineering, among others, where study subjects may experience a sequence of events of interest during follow-up. The R package reReg (Chiou and Huang 2021) offers a comprehensive collection of practical and easy-to-use tools for regression analysis of recurrent events, possibly in the presence of an informative terminal event. The regression framework is a general scale-change model which encompasses the popular Cox-type model, the accelerated rate model, and the accelerated mean model as special cases. Informative censoring is accommodated through a subject-specific frailty without the need for parametric specification. Different regression models are allowed for the recurrent event process and the terminal event. Also included are visualization and simulation tools.
110 - Chenchen Ma , Gongjun Xu 2021
Cognitive Diagnosis Models (CDMs) are a special family of discrete latent variable models that are widely used in modern educational, psychological, social and biological sciences. A key component of CDMs is a binary $Q$-matrix characterizing the dependence structure between the items and the latent attributes. Additionally, researchers also assume in many applications certain hierarchical structures among the latent attributes to characterize their dependence. In most CDM applications, the attribute-attribute hierarchical structures, the item-attribute $Q$-matrix, the item-level diagnostic model, as well as the number of latent attributes, need to be fully or partially pre-specified, which however may be subjective and misspecified as noted by many recent studies. This paper considers the problem of jointly learning these latent and hierarchical structures in CDMs from observed data with minimal model assumptions. Specifically, a penalized likelihood approach is proposed to select the number of attributes and estimate the latent and hierarchical structures simultaneously. An efficient expectation-maximization (EM) algorithm and a latent structure recovery algorithm are developed, and statistical consistency theory is also established under mild conditions. The good performance of the proposed method is illustrated by simulation studies and a real data application in educational assessment.
Latent class models are powerful statistical modeling tools widely used in psychological, behavioral, and social sciences. In the modern era of data science, researchers often have access to response data collected from large-scale surveys or assessments, featuring many items (large J) and many subjects (large N). This is in contrast to the traditional regime with fixed J and large N. To analyze such large-scale data, it is important to develop methods that are both computationally efficient and theoretically valid. In terms of computation, the conventional EM algorithm for latent class models tends to have a slow algorithmic convergence rate for large-scale data and may converge to some local optima instead of the maximum likelihood estimator (MLE). Motivated by this, we introduce the tensor decomposition perspective into latent class analysis. Methodologically, we propose to use a moment-based tensor power method in the first step, and then use the obtained estimators as initialization for the EM algorithm in the second step. Theoretically, we establish the clustering consistency of the MLE in assigning subjects into latent classes when N and J both go to infinity. Simulation studies suggest that the proposed tensor-EM pipeline enjoys both good accuracy and computational efficiency for large-scale data. We also apply the proposed method to a personality dataset as an illustration.
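To make the second step of the pipeline concrete, here is a minimal sketch of EM for a latent class model with binary items, started from caller-supplied initial values. The tensor power method of the first step is not implemented here; the arguments `pi_init` and `theta_init` simply stand in for whatever that step (or any other initializer) would produce. The function name `em_latent_class` and the clamping constants are assumptions made for this example.

```python
import math

def em_latent_class(data, pi_init, theta_init, n_iter=50):
    """EM for a latent class model with binary responses.

    data       : list of 0/1 response vectors, one per subject
    pi_init    : initial class proportions (length K)
    theta_init : theta_init[k][j] = initial P(item j = 1 | class k)
    Returns (pi, theta, post), where post[i][k] is the posterior
    probability that subject i belongs to class k.
    """
    n, J, K = len(data), len(data[0]), len(pi_init)
    pi = list(pi_init)
    theta = [list(row) for row in theta_init]
    post = []
    for _ in range(n_iter):
        # E-step: posterior class memberships, computed on the log scale.
        post = []
        for y in data:
            logw = []
            for k in range(K):
                lp = math.log(pi[k])
                for j in range(J):
                    p = theta[k][j]
                    lp += math.log(p if y[j] == 1 else 1.0 - p)
                logw.append(lp)
            top = max(logw)
            w = [math.exp(v - top) for v in logw]
            total = sum(w)
            post.append([v / total for v in w])
        # M-step: weighted proportions and item probabilities.
        for k in range(K):
            nk = max(sum(post[i][k] for i in range(n)), 1e-12)
            pi[k] = nk / n
            for j in range(J):
                num = sum(post[i][k] * data[i][j] for i in range(n))
                # Clamp away from 0 and 1 so the logs above stay finite.
                theta[k][j] = min(max(num / nk, 1e-6), 1.0 - 1e-6)
    return pi, theta, post
```

Because EM only climbs to a nearby local optimum, the quality of `pi_init` and `theta_init` largely determines where it ends up, which is the motivation for supplying a good moment-based starting point.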
109 - Jing Ouyang , Gongjun Xu 2021
Latent class models with covariates are widely used in psychological, social, and educational research. Yet the fundamental identifiability issue of these models has not been fully addressed. Among previous studies on the identifiability of latent class models containing covariates, Huang and Bandeen-Roche (2004, Psychometrika, 69:5-32) studied the local identifiability conditions. However, motivated by recent advances in the identifiability of restricted latent class models, particularly the Cognitive Diagnosis Models (CDMs), we show in this work that the conditions in Huang and Bandeen-Roche (2004) are only necessary but not sufficient to determine the local identifiability of the model parameters. To address the open identifiability issue for latent class models with covariates, this work establishes conditions to ensure the global identifiability of the model parameters in both the strict and generic senses. Moreover, our results extend to polytomous-response CDMs with covariates, generalizing the existing identifiability results for CDMs.
49 - Yuqi Gu , Gongjun Xu 2020
Structured Latent Attribute Models (SLAMs) are a family of discrete latent variable models widely used in education, psychology, and epidemiology to model multivariate categorical data. A SLAM assumes that multiple discrete latent attributes explain the dependence of observed variables in a highly structured fashion. Usually, the maximum marginal likelihood estimation approach is adopted for SLAMs, treating the latent attributes as random effects. The increasing scope of modern assessment data involves large numbers of observed variables and high-dimensional latent attributes. This poses challenges to classical estimation methods and requires new methodology and understanding of latent variable modeling. Motivated by this, we consider the joint maximum likelihood estimation (MLE) approach to SLAMs, treating latent attributes as fixed unknown parameters. We investigate estimability, consistency, and computation in the regime where sample size, number of variables, and number of latent attributes all can diverge. We establish the statistical consistency of the joint MLE and propose efficient algorithms that scale well to large-scale data for several popular SLAMs. Simulation studies demonstrate the superior empirical performance of the proposed methods. An application to real data from an international educational assessment gives interpretable findings of cognitive diagnosis.
This paper introduces a general framework for survival analysis based on ordinary differential equations (ODE). Specifically, this framework unifies many existing survival models, including proportional hazards models, linear transformation models, accelerated failure time models, and time-varying coefficient models as special cases. Such a unified framework provides a novel perspective on modeling censored data and offers opportunities for designing new and more flexible survival model structures. Further, the aforementioned existing survival models are traditionally estimated by procedures that suffer from lack of scalability, statistical inefficiency, or implementation difficulty. Based on well-established numerical solvers and sensitivity analysis tools for ODEs, we propose a novel, scalable, and easy-to-implement general estimation procedure that is applicable to a wide range of models. In particular, we develop a sieve maximum likelihood estimator for a general semi-parametric class of ODE models as an illustrative example. We also establish a general sieve M-theorem for bundled parameters and show that the proposed sieve estimator is consistent and asymptotically normal, and achieves the semi-parametric efficiency bound. The finite sample performance of the proposed estimator is examined in simulation studies and a real-world data example.
This paper investigates the (in)-consistency of various bootstrap methods for making inference on a change-point in time in the Cox model with right censored survival data. A criterion is established for the consistency of any bootstrap method. It is shown that the usual nonparametric bootstrap is inconsistent for the maximum partial likelihood estimation of the change-point. A new model-based bootstrap approach is proposed and its consistency established. Simulation studies are carried out to assess the performance of various bootstrap schemes.