The linear exponential distribution is a generalization of the exponential and Rayleigh distributions. It is one of the best models for fitting data with an increasing failure rate (IFR), but it does not provide a reasonable fit for data with a decreasing failure rate (DFR) or a bathtub-shaped failure rate (BTFR). To overcome this drawback, we propose a new record-based transmuted generalized linear exponential (RTGLE) distribution using the technique of Balakrishnan and He (2021). The family of RTGLE distributions is flexible enough to fit data sets with IFR, DFR, and BTFR, and it generalizes several well-known models as well as some new record-based transmuted models. This paper studies the statistical properties of the RTGLE distribution, such as the shapes of the probability density and hazard functions, the quantile function and its applications, moments and the moment generating function, order and record statistics, and the Rényi entropy. The maximum likelihood, least squares and weighted least squares, Anderson-Darling, and Cramér-von Mises estimators of the unknown parameters are constructed, and their biases and mean squared errors are reported via a Monte Carlo simulation study. Finally, a real data set of failure times illustrates the goodness of fit and applicability of the proposed distribution; hence, suitable recommendations are made.
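As a concrete illustration of the construction, the record-based transmuted map of Balakrishnan and He (2021) turns a baseline CDF $F$ into $G(x) = F(x) + p\,\bar{F}(x)\ln\bar{F}(x)$, where $\bar{F} = 1 - F$ and $p \in [0,1]$. The Python sketch below applies this map to a generalized linear exponential baseline of the assumed form $F(x) = 1 - \exp\{-(ax + bx^2/2)^c\}$ and samples by numerical inversion; the paper's exact RTGLE parameterization may differ.

```python
import numpy as np
from scipy.optimize import brentq

def gle_cdf(x, a, b, c):
    """Generalized linear exponential baseline CDF (assumed form)."""
    return 1.0 - np.exp(-(a * x + 0.5 * b * x**2) ** c)

def rtgle_cdf(x, a, b, c, p):
    """Record-based transmuted CDF: G = F + p * (1 - F) * log(1 - F)."""
    F = gle_cdf(x, a, b, c)
    Fbar = np.clip(1.0 - F, 1e-300, 1.0)  # avoid log(0); 0*log(0) -> 0
    return F + p * Fbar * np.log(Fbar)

def rtgle_rvs(n, a, b, c, p, rng=None, upper=1e6):
    """Draw samples by numerically inverting the CDF (quantile method)."""
    rng = np.random.default_rng(rng)
    u = rng.uniform(size=n)
    return np.array([brentq(lambda x: rtgle_cdf(x, a, b, c, p) - ui, 0.0, upper)
                     for ui in u])

# Example: 5 draws with an increasing-failure-rate configuration.
print(rtgle_rvs(5, a=1.0, b=0.5, c=1.2, p=0.3, rng=42))
```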
In this paper, we introduce a new three-parameter distribution based on combining a re-parametrization of the so-called EGNB2 distribution with the transmuted exponential distribution. This combination modifies the transmuted exponential distribution through an additional parameter, mainly adding a high degree of flexibility to the mode and affecting the skewness and kurtosis of the tail. We explore some mathematical properties of this distribution, including the hazard rate function, moments, the moment generating function, the quantile function, various entropy measures, and (reversed) residual life functions. A statistical study investigates estimation of the parameters by the method of maximum likelihood. The distribution, along with other existing distributions, is fitted to two environmental data sets, and its superior performance is assessed using several goodness-of-fit tests. As a result, some environmental measures associated with these data, such as the return level and the mean deviation about this level, are obtained.
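For reference, the transmuted exponential component arises from the quadratic rank transmutation map: starting from the exponential CDF $F(x) = 1 - e^{-\theta x}$, the transmuted CDF is $G(x) = (1+\lambda)F(x) - \lambda F(x)^2$ with $|\lambda| \le 1$. A minimal sketch of maximum likelihood fitting for this two-parameter component follows; the EGNB2 re-parametrization that supplies the paper's third parameter is not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize

def trans_exp_pdf(x, theta, lam):
    """Transmuted exponential density via the quadratic rank transmutation map."""
    F = 1.0 - np.exp(-theta * x)
    f = theta * np.exp(-theta * x)
    return f * (1.0 + lam - 2.0 * lam * F)

def neg_log_lik(params, x):
    theta, lam = params
    if theta <= 0 or not (-1.0 <= lam <= 1.0):
        return np.inf
    pdf = trans_exp_pdf(x, theta, lam)
    return np.inf if np.any(pdf <= 0) else -np.sum(np.log(pdf))

# Simulate from the model by solving G(x) = u for F, then inverting F.
rng = np.random.default_rng(0)
u = rng.uniform(size=500)
theta_true, lam_true = 2.0, 0.5
F = ((1 + lam_true) - np.sqrt((1 + lam_true) ** 2 - 4 * lam_true * u)) / (2 * lam_true)
x = -np.log(1.0 - F) / theta_true

fit = minimize(neg_log_lik, x0=[1.0, 0.0], args=(x,), method="Nelder-Mead")
print("MLE (theta, lambda):", fit.x)
```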
Modelling edge weights plays a crucial role in the analysis of network data, as edge weights reveal the strength of relationships among individuals. Owing to the diversity of weight information, sharing these data in a privacy-preserving way is a complicated challenge. In this paper, we consider the non-denoising setting to achieve a trade-off between privacy and weight information in the generalized $\beta$-model. Under edge differential privacy with a discrete Laplace mechanism, the Z-estimators obtained from the estimating equations for the model parameters are shown to be consistent and asymptotically normal. Simulations and a real data example further support the theoretical results.
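To fix ideas, a discrete Laplace (two-sided geometric) mechanism adds integer-valued noise with probability mass proportional to $\exp(-|z|/b)$ to each released statistic. A minimal sketch, assuming the released statistics are the weighted degrees of the network (the paper's exact sufficient statistics, sensitivity bound, and estimating equations are not reproduced):

```python
import numpy as np

def discrete_laplace(size, scale, rng):
    """Discrete Laplace noise as a difference of two i.i.d. geometric variables;
    P(Z = z) is proportional to exp(-|z| / scale)."""
    p = 1.0 - np.exp(-1.0 / scale)
    return rng.geometric(p, size) - rng.geometric(p, size)

def privatize_degrees(weights, epsilon, rng=None):
    """Release weighted degrees under edge differential privacy.

    weights: symmetric (n, n) array of nonnegative integer edge weights.
    Changing one edge changes two degrees, so scale = 2 / epsilon is used
    here as an illustrative sensitivity assumption.
    """
    rng = np.random.default_rng(rng)
    degrees = weights.sum(axis=1)
    noise = discrete_laplace(degrees.shape, scale=2.0 / epsilon, rng=rng)
    return degrees + noise

rng = np.random.default_rng(1)
W = rng.integers(0, 4, size=(6, 6))
W = np.triu(W, 1) + np.triu(W, 1).T  # symmetric weights, zero diagonal
print(privatize_degrees(W, epsilon=1.0, rng=2))
```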
In record linkage (RL), or exact file matching, the goal is to identify links between entities using information on two or more files. RL is an important activity in areas including counting the population, enhancing survey frames and data, and conducting epidemiological and follow-up studies. RL is challenging when files are very large, no accurate personal identification (ID) number is present on all files for all units, and some information is recorded with error. Without a unique ID number, one must rely on comparisons of names, addresses, dates, and other information to find the links. Latent class models can be used to automatically score the value of information for determining match status. Data for fitting the models come from comparisons made within groups of units that pass initial file blocking requirements, and data distributions can vary across blocks. This article examines the use of prior information and hierarchical latent class models in the context of RL.
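As a baseline for the models discussed here, a two-class latent class model treats each comparison vector of agree/disagree indicators as a mixture of independent Bernoullis for matches and non-matches, in the spirit of the Fellegi-Sunter framework. A minimal EM sketch on simulated comparison data (the article's hierarchical, block-specific structure and prior information are not included):

```python
import numpy as np

def lca_em(G, n_iter=200, seed=0):
    """EM for a 2-class latent class model on binary comparison vectors.

    G: (n, K) array of 0/1 field-agreement indicators.
    Returns match prevalence pi, per-field agreement rates m (matches)
    and u (non-matches), and posterior match probabilities w.
    """
    rng = np.random.default_rng(seed)
    n, K = G.shape
    pi = 0.1
    m = rng.uniform(0.7, 0.95, K)   # agreement prob given match
    u = rng.uniform(0.05, 0.3, K)   # agreement prob given non-match
    for _ in range(n_iter):
        # E-step: posterior probability that each pair is a match.
        lm = G @ np.log(m) + (1 - G) @ np.log(1 - m) + np.log(pi)
        lu = G @ np.log(u) + (1 - G) @ np.log(1 - u) + np.log(1 - pi)
        w = 1.0 / (1.0 + np.exp(lu - lm))
        # M-step: weighted relative frequencies.
        pi = w.mean()
        m = np.clip(w @ G / w.sum(), 1e-6, 1 - 1e-6)
        u = np.clip((1 - w) @ G / (1 - w).sum(), 1e-6, 1 - 1e-6)
    return pi, m, u, w

# Toy data: 3 comparison fields, about 5% true matches.
rng = np.random.default_rng(3)
z = rng.uniform(size=1000) < 0.05
G = np.where(z[:, None], rng.uniform(size=(1000, 3)) < 0.9,
             rng.uniform(size=(1000, 3)) < 0.1).astype(int)
print(lca_em(G)[:3])
```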
Bayesian posterior distributions are widely used for inference, but their dependence on a statistical model creates some challenges. In particular, there may be many nuisance parameters that require prior distributions and posterior computations, plus a potentially serious risk of model misspecification bias. Gibbs posterior distributions, on the other hand, offer direct, principled, probabilistic inference on quantities of interest through a loss function rather than a model-based likelihood. Here we provide simple sufficient conditions for establishing Gibbs posterior concentration rates when the loss function is of sub-exponential type. We apply these general results to a range of practically relevant examples, including mean regression, quantile regression, and sparse high-dimensional classification. We also apply these techniques to an important problem in medical statistics, namely estimation of a personalized minimum clinically important difference.
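Concretely, a Gibbs posterior replaces the likelihood with an exponentiated empirical risk: $\pi_n(\theta) \propto \exp\{-\omega n R_n(\theta)\}\,\pi(\theta)$, where $R_n$ is the average loss and $\omega > 0$ is a learning rate. A minimal random-walk Metropolis sketch for the quantile regression example with the check loss; the flat prior, step size, and learning rate here are illustrative placeholders, not calibrated choices.

```python
import numpy as np

def check_loss(r, tau):
    """Quantile (check) loss averaged over residuals r at level tau."""
    return np.mean(r * (tau - (r < 0)))

def gibbs_posterior_mh(X, y, tau=0.5, omega=1.0, n_samp=5000, step=0.05, seed=0):
    """Random-walk Metropolis targeting exp(-omega * n * R_n(theta)) under a
    flat prior, where R_n is the empirical check loss of a linear model."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    risk = check_loss(y - X @ theta, tau)
    draws = np.empty((n_samp, d))
    for t in range(n_samp):
        prop = theta + step * rng.standard_normal(d)
        prop_risk = check_loss(y - X @ prop, tau)
        if np.log(rng.uniform()) < -omega * n * (prop_risk - risk):
            theta, risk = prop, prop_risk
        draws[t] = theta
    return draws

rng = np.random.default_rng(4)
X = np.column_stack([np.ones(200), rng.standard_normal(200)])
y = X @ np.array([1.0, 2.0]) + rng.standard_normal(200)
draws = gibbs_posterior_mh(X, y)
print(draws[2500:].mean(axis=0))  # posterior mean after burn-in
```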
In many applications, the dataset under investigation exhibits heterogeneous regimes that are more appropriately modeled using piecewise linear models for the data segments separated by change points. Although there has been much work on change-point linear regression in the low-dimensional case, high-dimensional change-point regression is severely underdeveloped. Motivated by the analysis of Minnesota House Price Index data, we propose a fully Bayesian framework for fitting changing linear regression models in high-dimensional settings. Using segment-specific shrinkage and diffusion priors, we deliver full posterior inference for the change points and simultaneously obtain posterior probabilities of variable selection in each segment via an efficient Gibbs sampler. Additionally, our method can detect an unknown number of change points and accommodate different variable selection constraints, such as grouping or partial selection. We substantiate the accuracy of our method using simulation experiments across a wide range of scenarios, and we apply our approach to a macroeconomic analysis of the Minnesota house price index data. The results strongly favor the change point model over a homogeneous (no change point) high-dimensional regression model.
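As a simplified, low-dimensional analogue of the underlying idea (a grid scan with least squares, not the paper's Gibbs sampler with shrinkage and diffusion priors), one can score each candidate change point by the summed residual errors of segment-wise fits:

```python
import numpy as np

def segment_sse(X, y):
    """Residual sum of squares of a least-squares fit on one segment."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def best_change_point(X, y, min_seg=10):
    """Grid search over a single change point in a piecewise linear model."""
    n = len(y)
    scores = {k: segment_sse(X[:k], y[:k]) + segment_sse(X[k:], y[k:])
              for k in range(min_seg, n - min_seg)}
    return min(scores, key=scores.get)

rng = np.random.default_rng(5)
n = 200
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta1, beta2 = np.array([0.0, 1.0]), np.array([2.0, -1.0])
y = np.concatenate([X[:120] @ beta1, X[120:] @ beta2]) + 0.3 * rng.standard_normal(n)
print("estimated change point:", best_change_point(X, y))  # true value: 120
```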