
An application of Zero-One Inflated Beta regression models for predicting health insurance reimbursement

Published by Davide Biancalana
Publication date: 2020
Research field: Mathematical Statistics, Finance
Research language: English





In actuarial practice, the dependency between contract limitations (deductibles, copayments) and health care expenditures is measured by Monte Carlo simulation. We propose, for the same goal, an alternative approach based on Generalized Additive Models for Location, Scale and Shape (GAMLSS). We focus on estimating the ratio between the one-year reimbursement amount (after the effect of limitations) and the one-year expenditure (before the effect of limitations). We suggest a regression model to investigate the relation between this response variable and a set of covariates, such as limitations and other rating factors related to health risk. In this way, a dependency structure between reimbursement and limitations is provided. The distribution of the ratio is a mixture: it has probability mass at 0 and 1, in addition to a continuous probability density within (0, 1). This random variable does not belong to the exponential family, so an ordinary Generalized Linear Model is not suitable. GAMLSS introduces a probability structure compliant with the distribution of the response variable; in particular, a zero-one inflated beta density is assumed. The latter is a mixture of a Bernoulli distribution and a Beta distribution.
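To make the response variable concrete, here is a minimal Python sketch of a reimbursement ratio and of a zero-one inflated beta density. It is an illustration under stated assumptions, not the paper's implementation. The contract rule in `reimbursement_ratio` (a deductible followed by a proportional copayment) is hypothetical. The parameterization (`p0`, `p1` for the point masses, mean `mu` and precision `phi` for the Beta part) is one common choice and may differ from the one used by the authors or by GAMLSS software.

```python
import math
import random


def reimbursement_ratio(expenditure, deductible, copay_rate):
    """Ratio of reimbursement to expenditure under a HYPOTHETICAL contract:
    reimbursement = (1 - copay_rate) * max(expenditure - deductible, 0)."""
    if expenditure <= 0:
        raise ValueError("expenditure must be positive")
    reimbursed = (1.0 - copay_rate) * max(expenditure - deductible, 0.0)
    return reimbursed / expenditure


def zoib_density(y, p0, p1, mu, phi):
    """Zero-one inflated beta (assumed parameterization):
    point mass p0 at 0, point mass p1 at 1, and a Beta(mu*phi, (1-mu)*phi)
    density weighted by (1 - p0 - p1) on the open interval (0, 1)."""
    if y == 0.0:
        return p0
    if y == 1.0:
        return p1
    if not 0.0 < y < 1.0:
        return 0.0
    a, b = mu * phi, (1.0 - mu) * phi
    # Log of the Beta function B(a, b), computed via log-gamma for stability.
    log_beta_fn = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_pdf = (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y) - log_beta_fn
    return (1.0 - p0 - p1) * math.exp(log_pdf)


def zoib_sample(p0, p1, mu, phi, rng):
    """Draw one value: 0 with probability p0, 1 with probability p1,
    otherwise a Beta draw strictly inside (0, 1)."""
    u = rng.random()
    if u < p0:
        return 0.0
    if u < p0 + p1:
        return 1.0
    return rng.betavariate(mu * phi, (1.0 - mu) * phi)


if __name__ == "__main__":
    # Expenditure below the deductible gives ratio 0; no limitations gives ratio 1.
    print(reimbursement_ratio(80.0, 100.0, 0.2))
    print(reimbursement_ratio(80.0, 0.0, 0.0))
    rng = random.Random(0)
    draws = [zoib_sample(0.3, 0.2, 0.6, 5.0, rng) for _ in range(10000)]
    print(sum(d == 0.0 for d in draws) / len(draws))  # close to p0
```

The two boundary cases show why point masses arise: the ratio is exactly 0 whenever the expenditure does not exceed the deductible, and exactly 1 when no limitation applies. In a regression setting, each of `p0`, `p1`, `mu` and `phi` would be linked to covariates rather than held fixed as here.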




Read also

In this paper we review Bernstein and grid-type copulas for arbitrary dimensions and general grid resolutions in connection with discrete random vectors possessing uniform margins. We further suggest a pragmatic way to fit the dependence structure of multivariate data to Bernstein copulas via grid-type copulas and empirical contingency tables. Finally, we discuss a Monte Carlo study for the simulation and PML estimation for aggregate dependent losses from observed windstorm and flooding data.
Modern RNA sequencing technologies provide gene expression measurements from single cells that promise refined insights on regulatory relationships among genes. Directed graphical models are well-suited to explore such (cause-effect) relationships. However, statistical analyses of single cell data are complicated by the fact that the data often show zero-inflated expression patterns. To address this challenge, we propose directed graphical models that are based on Hurdle conditional distributions parametrized in terms of polynomials in parent variables and their 0/1 indicators of being zero or nonzero. While directed graphs for Gaussian models are only identifiable up to an equivalence class in general, we show that, under a natural and weak assumption, the exact directed acyclic graph of our zero-inflated models can be identified. We propose methods for graph recovery, apply our model to real single-cell RNA-seq data on T helper cells, and show simulated experiments that validate the identifiability and graph estimation methods in practice.
Beta regression models provide an adequate approach for modeling continuous outcomes limited to the interval (0,1). This paper deals with an extension of beta regression models that allow for explanatory variables to be measured with error. The structural approach, in which the covariates measured with error are assumed to be random variables, is employed. Three estimation methods are presented, namely maximum likelihood, maximum pseudo-likelihood and regression calibration. Monte Carlo simulations are used to evaluate the performance of the proposed estimators and the naive estimator. Also, a residual analysis for beta regression models with measurement errors is proposed. The results are illustrated in a real data set.
A key problem in computational sustainability is to understand the distribution of species across landscapes over time. This question gives rise to challenging large-scale prediction problems since (i) hundreds of species have to be simultaneously modeled and (ii) the survey data are usually inflated with zeros due to the absence of species for a large number of sites. The problem of tackling both issues simultaneously, which we refer to as the zero-inflated multi-target regression problem, has not been addressed by previous methods in statistics and machine learning. In this paper, we propose a novel deep model for the zero-inflated multi-target regression problem. To this end, we first model the joint distribution of multiple response variables as a multivariate probit model and then couple the positive outcomes with a multivariate log-normal distribution. By penalizing the difference between the two distributions' covariance matrices, a link between both distributions is established. The whole model is cast as an end-to-end learning framework and we provide an efficient learning algorithm for our model that can be fully implemented on GPUs. We show that our model outperforms the existing state-of-the-art baselines on two challenging real-world species distribution datasets concerning bird and fish populations.
This paper proposes a maximum-likelihood approach to jointly estimate marginal conditional quantiles of multivariate response variables in a linear regression framework. We consider a slight reparameterization of the Multivariate Asymmetric Laplace distribution proposed by Kotz et al. (2001) and exploit its location-scale mixture representation to implement a new EM algorithm for estimating model parameters. The idea is to extend the link between the Asymmetric Laplace distribution and the well-known univariate quantile regression model to a multivariate context, i.e. when a multivariate dependent variable is concerned. The approach accounts for association among multiple responses and studies how the relationship between responses and explanatory variables can vary across different quantiles of the marginal conditional distribution of the responses. A penalized version of the EM algorithm is also presented to tackle the problem of variable selection. The validity of our approach is analyzed in a simulation study, where we also provide evidence on the efficiency gain of the proposed method compared to estimation obtained by separate univariate quantile regressions. A real data application is finally proposed to study the main determinants of financial distress in a sample of Italian firms.