
Data Envelopment Analysis models with imperfect knowledge of input and output values: An application to Portuguese public hospitals

Added by Salvatore Greco
Publication date: 2021
Language: English





Assessing the technical efficiency of a set of observations requires that the associated input and output data are known exactly. If they are not, the resulting estimates are likely to be biased. Data Envelopment Analysis (DEA) is one of the most widely used mathematical models for estimating efficiency. It constructs a piecewise linear frontier against which all observations are compared. Since the frontier is empirically defined, any deviation resulting from low data quality (imperfect knowledge of data, or IKD) may lead to under- or overestimation of efficiency. In this study, we model IKD and then apply the so-called Hit & Run procedure to randomly generate admissible observations, following some prespecified probability density functions. The sets used to model IKD bound the domain of the data associated with each observation, and any point in that domain is a candidate realisation of the observation for efficiency assessment. Hence, this sampling procedure must be run a sizable number of times (infinitely many, in theory) so that it populates the whole sets. At each iteration, the DEA technique is used to estimate bootstrapped efficiency scores for each observation. Using several scenarios, we show that the proposed routine can outperform some of the available alternatives, and we explain how the efficiency estimates can be used for statistical inference. Finally, the proposed method is applied to an empirical case study based on the Portuguese public hospitals database (2013-2016).
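As a rough illustration of the pipeline described in the abstract, the following Python sketch combines a Hit & Run sampler with a standard DEA model. It assumes box-shaped IKD sets (each input and output known only to within ±10% of a nominal value, an illustrative choice) and the input-oriented CCR envelopment model solved with scipy.optimize.linprog; the paper does not prescribe these particular choices, and all function names and toy data here are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def dea_ccr_input(X, Y, o):
    """Input-oriented CCR efficiency of DMU o (columns of X, Y are DMUs)."""
    m, n = X.shape
    s = Y.shape[0]
    c = np.zeros(n + 1)
    c[0] = 1.0                                  # minimise theta
    # Inputs:  sum_j lambda_j * x_ij - theta * x_io <= 0
    A_in = np.hstack([-X[:, [o]], X])
    # Outputs: -sum_j lambda_j * y_rj <= -y_ro  (outputs at least y_ro)
    A_out = np.hstack([np.zeros((s, 1)), -Y])
    res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.concatenate([np.zeros(m), -Y[:, o]]),
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.fun                              # theta* in (0, 1]

def hit_and_run(x, lo, hi, rng):
    """One Hit & Run move inside the box [lo, hi] (uniform target density)."""
    d = rng.normal(size=x.shape)
    d /= np.linalg.norm(d)                      # random direction on the sphere
    t1, t2 = (lo - x) / d, (hi - x) / d         # chord of the box along d
    t_min = np.max(np.minimum(t1, t2))
    t_max = np.min(np.maximum(t1, t2))
    return x + rng.uniform(t_min, t_max) * d    # uniform point on the chord

rng = np.random.default_rng(0)
n, m, s, B = 10, 2, 1, 200                      # DMUs, inputs, outputs, replications
X0 = rng.uniform(2.0, 10.0, size=(m, n))        # nominal inputs (toy data)
Y0 = rng.uniform(1.0, 5.0, size=(s, n))         # nominal outputs (toy data)
lo_X, hi_X = 0.9 * X0, 1.1 * X0                 # assumed +/-10% IKD boxes
lo_Y, hi_Y = 0.9 * Y0, 1.1 * Y0

scores = np.empty((B, n))
Xc, Yc = X0.copy(), Y0.copy()
for b in range(B):
    # One Hit & Run move per replication walks through the IKD sets ...
    Xc = hit_and_run(Xc.ravel(), lo_X.ravel(), hi_X.ravel(), rng).reshape(m, n)
    Yc = hit_and_run(Yc.ravel(), lo_Y.ravel(), hi_Y.ravel(), rng).reshape(s, n)
    # ... and DEA is re-solved on each admissible realisation of the data.
    scores[b] = [dea_ccr_input(Xc, Yc, o) for o in range(n)]

print(scores.mean(axis=0))                      # bootstrapped efficiency per DMU
```

Because each Hit & Run move starts from the previous realisation, successive samples stay inside the IKD boxes while gradually populating them, which mirrors the idea of re-solving DEA over many admissible data sets.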



Related research

There are several cutting-edge applications that need PCA methods for data on tori, and we propose a novel torus-PCA method with important properties that can be applied generally. Two general methods exist: tangent space PCA and geodesic PCA. However, unlike tangent space PCA, our torus-PCA honors the cyclic topology of the data space, and unlike geodesic PCA, it produces a variety of non-winding, non-dense descriptors. This is achieved by deforming tori into spheres and then applying a variant of the recently developed principal nested spheres analysis. This analysis involves a small-sphere fitting step, and we provide an improved test to avoid overfitting. However, deforming tori into spheres creates singularities, so we introduce a data-adaptive pre-clustering technique to keep the singularities away from the data. For the frequently encountered case in which the residual variance around the PCA main component is small, we use a post-mode hunting technique for finer-grained clustering. In general, therefore, torus-PCA in practice involves three successive, interrelated key steps: pre-clustering, deformation, and post-mode hunting. We illustrate our method with two recently studied RNA structure (tori) data sets: a small RNA data set, established as a benchmark for PCA, through which we validate our method, and a large RNA data set (containing the small one) for which our method provides interpretable principal components as well as further insight into its structure.
A new class of survival frailty models based on the Generalized Inverse-Gaussian (GIG) distributions is proposed. We show that the GIG frailty models are as flexible and mathematically convenient as the popular gamma frailty model. Furthermore, the proposed class is robust and does not present some of the computational issues experienced by the gamma model. By assuming a piecewise-exponential baseline hazard function, which gives our frailty class a semiparametric flavour, we propose an EM algorithm for estimating the model parameters and provide an explicit expression for the information matrix. Simulation results are presented to check the finite-sample behavior of the EM estimators and to study the performance of the GIG models under misspecification. We apply our methodology to TARGET (Therapeutically Applicable Research to Generate Effective Treatments) data on the survival times of patients with neuroblastoma cancer and show some advantages of the GIG frailties over existing models in the literature.
The selection of grouped variables using the random forest algorithm is considered. First, a new importance measure adapted to groups of variables is proposed, and theoretical insights into this criterion are given for additive regression models. Second, an original method for selecting functional variables based on the grouped variable importance measure is developed: using a wavelet basis, all of the wavelet coefficients of a given functional variable are regrouped, and a wrapper selection algorithm is run on these groups. Various other groupings that take advantage of the frequency and time localization of the wavelet basis are also proposed. An extensive simulation study illustrates the use of the grouped importance measure in this context, and the method is applied to a real-life problem from aviation safety.
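The paper's own grouped importance criterion is not reproduced here, but the following Python sketch shows one common permutation-based variant of the idea: permute all columns of a group jointly and record the average drop in predictive accuracy. The helper name `grouped_permutation_importance` and the toy groups are illustrative assumptions, not the paper's definitions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

def grouped_permutation_importance(model, X, y, groups, n_repeats=20, seed=0):
    """Importance of each *group* of columns: joint permutation of the
    whole group and the resulting average drop in R^2 (an assumed,
    permutation-based variant of a grouped importance measure)."""
    rng = np.random.default_rng(seed)
    base = model.score(X, y)
    importances = {}
    for name, cols in groups.items():
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            perm = rng.permutation(len(X))
            Xp[:, cols] = X[perm][:, cols]      # permute the group as a block
            drops.append(base - model.score(Xp, y))
        importances[name] = float(np.mean(drops))
    return importances

# Toy additive model: y depends on group "A" (x0, x1) but not on "B" (x2, x3)
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = X[:, 0] + 2.0 * X[:, 1] + 0.1 * rng.normal(size=500)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(grouped_permutation_importance(rf, X_te, y_te, {"A": [0, 1], "B": [2, 3]}))
```

Permuting the group's columns as a block preserves the dependence between variables inside the group, which is why a joint permutation is preferred over permuting each column separately.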
In Functional Data Analysis, data are commonly assumed to be smooth functions on a fixed interval of the real line. In this work, we introduce a comprehensive framework for the analysis of functional data, whose domain is a two-dimensional manifold and the domain itself is subject to variability from sample to sample. We formulate a statistical model for such data, here called Functions on Surfaces, which enables a joint representation of the geometric and functional aspects, and propose an associated estimation framework. We assess the validity of the framework by performing a simulation study and we finally apply it to the analysis of neuroimaging data of cortical thickness, acquired from the brains of different subjects, and thus lying on domains with different geometries.
In this work we define a spatial concordance coefficient for second-order stationary processes. This problem has been widely addressed in a non-spatial context, but here we consider a coefficient that, for a fixed spatial lag, allows one to compare two spatial sequences along a 45-degree line. The proposed coefficient was explored for the bivariate Matérn and Wendland covariance functions. The asymptotic normality of a sample version of the spatial concordance coefficient under an increasing-domain sampling framework was established for the Wendland covariance function. To work with large digital images, we developed a local approach for estimating the concordance that fits local spatial models on non-overlapping windows. Monte Carlo simulations were used to gain additional insight into the asymptotic properties for finite sample sizes. As an illustrative example, we applied this methodology to two similar images of a deciduous forest canopy. The images were recorded with different cameras but similar fields of view and within minutes of each other. Our analysis showed that the local approach helped to explain a percentage of the non-spatial concordance and to provide additional information about its decay as a function of the spatial lag.
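As a toy illustration of the local-window idea only, the sketch below computes Lin's classical (non-spatial) concordance correlation coefficient on non-overlapping blocks of two images. The paper's coefficient is spatial and model-based, so this is a simplified analogue; the names `lin_ccc` and `local_ccc` and the synthetic images are assumptions for the example.

```python
import numpy as np

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient: agreement of two
    flattened image blocks with the 45-degree line."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    sxy = ((x - mx) * (y - my)).mean()
    return 2 * sxy / (vx + vy + (mx - my) ** 2)

def local_ccc(img1, img2, w):
    """CCC on non-overlapping w x w windows of two same-sized images."""
    H, W = img1.shape
    out = np.empty((H // w, W // w))
    for i in range(H // w):
        for j in range(W // w):
            b1 = img1[i*w:(i+1)*w, j*w:(j+1)*w].ravel()
            b2 = img2[i*w:(i+1)*w, j*w:(j+1)*w].ravel()
            out[i, j] = lin_ccc(b1, b2)
    return out

# Two noisy versions of the same synthetic "canopy" image
rng = np.random.default_rng(2)
base = rng.normal(size=(64, 64))
a = base + 0.1 * rng.normal(size=base.shape)
b = 0.9 * base + 0.1 * rng.normal(size=base.shape)
print(local_ccc(a, b, 16).round(2))             # per-window agreement map
```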
