
Box-Cox symmetric distributions and applications to nutritional data

Added by Giovana Fumes
Publication date: 2016
Research language: English





We introduce the Box-Cox symmetric class of distributions, which is useful for modeling positively skewed, possibly heavy-tailed data. The new class includes the Box-Cox t, Box-Cox Cole-Green, and Box-Cox power exponential distributions, as well as the class of log-symmetric distributions, as special cases. It provides easy parameter interpretation, which makes it convenient for regression modeling purposes. Additionally, it provides enough flexibility to handle outliers. The usefulness of the Box-Cox symmetric models is illustrated in applications to nutritional data.
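As a rough illustration of how the construction in the abstract works, the sketch below evaluates a Box-Cox symmetric density in Python: a positive observation y is mapped to z by the Box-Cox transform with parameters (mu, sigma, lam), and z is taken to follow a symmetric distribution truncated so that y remains positive. Passing scipy's normal recovers the Box-Cox Cole-Green case and a Student-t the Box-Cox t case. The function name and the exact form of the truncation term follow my reading of the usual GAMLSS-style parameterization, not necessarily the paper's notation.

import numpy as np
from scipy import stats

def box_cox_symmetric_pdf(y, mu, sigma, lam, sym=stats.norm):
    """Sketch of a Box-Cox symmetric density (assumed GAMLSS-style parameterization).

    sym=stats.norm gives the Box-Cox Cole-Green case; sym=stats.t(df) gives the
    Box-Cox t case; any symmetric scipy distribution with pdf/cdf can be plugged in.
    """
    y = np.asarray(y, dtype=float)
    if lam != 0.0:
        z = ((y / mu) ** lam - 1.0) / (lam * sigma)       # Box-Cox transform
        jac = y ** (lam - 1.0) / (mu ** lam * sigma)      # |dz/dy|
        trunc = sym.cdf(1.0 / (sigma * abs(lam)))         # mass kept by truncating to y > 0
    else:
        z = np.log(y / mu) / sigma                        # log-limit of the transform
        jac = 1.0 / (y * sigma)
        trunc = 1.0
    return jac * sym.pdf(z) / trunc

# Example: a right-skewed density evaluated on a grid of positive values
grid = np.linspace(0.1, 5.0, 50)
dens = box_cox_symmetric_pdf(grid, mu=1.5, sigma=0.4, lam=0.3)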



Related research

We propose and study the class of Box-Cox elliptical distributions. It provides alternative distributions for modeling multivariate positive, marginally skewed and possibly heavy-tailed data. This new class of distributions has as a special case the class of log-elliptical distributions, and reduces to the Box-Cox symmetric class of distributions in the univariate setting. The parameters are interpretable in terms of quantiles and relative dispersions of the marginal distributions and of associations between pairs of variables. The relation between the scale parameters and quantiles makes the Box-Cox elliptical distributions attractive for regression modeling purposes. Applications to data on vitamin intake are presented and discussed.
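The simplest member of the log-elliptical special case mentioned above is a multivariate log-normal, i.e. the componentwise exponential of a correlated Gaussian vector. The short sketch below samples from it to show the positive, right-skewed, dependent marginals this class is built for; the Box-Cox marginal transforms and non-Gaussian elliptical generators of the full class are not implemented, and the chosen mean and covariance are arbitrary illustrative values.

import numpy as np

rng = np.random.default_rng(0)

# Bivariate log-normal: exp of a correlated Gaussian vector (log-elliptical special case)
mean = np.array([0.0, 0.5])
cov = np.array([[1.0, 0.6],
                [0.6, 0.8]])
y = np.exp(rng.multivariate_normal(mean, cov, size=1000))

# In this special case the marginal medians equal exp(mean), which is the kind of
# quantile-based parameter interpretation the abstract refers to.
medians = np.median(y, axis=0)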
Tractable generalizations of the Gaussian distribution play an important role in the analysis of high-dimensional data. One very general super-class of normal distributions is the class of $\nu$-spherical distributions, whose random variables can be represented as the product $x = r \cdot u$ of a uniformly distributed random variable $u$ on the $1$-level set of a positively homogeneous function $\nu$ and an arbitrary positive radial random variable $r$. Prominent subclasses of $\nu$-spherical distributions are the spherically symmetric distributions ($\nu(x) = \|x\|_2$), which have been further generalized to the class of $L_p$-spherically symmetric distributions ($\nu(x) = \|x\|_p$). Both of these classes contain the Gaussian as a special case. In general, however, $\nu$-spherical distributions are computationally intractable since, for instance, the normalization constant or fast sampling algorithms are unknown for an arbitrary $\nu$. In this paper we introduce a new subclass of $\nu$-spherical distributions by choosing $\nu$ to be a nested cascade of $L_p$-norms. This class is still computationally tractable, yet includes all the aforementioned subclasses as special cases. We derive a general expression for $L_p$-nested symmetric distributions as well as the uniform distribution on the $L_p$-nested unit sphere, including an explicit expression for the normalization constant. We state several general properties of $L_p$-nested symmetric distributions, investigate their marginals and maximum likelihood fitting, and discuss their tight links to well-known machine learning methods such as Independent Component Analysis (ICA), Independent Subspace Analysis (ISA), and mixed-norm regularizers. Finally, we derive a fast and exact sampling algorithm for arbitrary $L_p$-nested symmetric distributions, and introduce the Nested Radial Factorization algorithm (NRF), which is a form of non-linear ICA.
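A minimal sketch of the paper's key ingredient, the nested cascade of $L_p$-norms, is given below. The tree encoding (a leaf is a coordinate index; an internal node is a pair of an exponent p and a list of children) is my own illustrative convention rather than the paper's notation, and only the evaluation of $\nu(x)$ is shown, not sampling or the normalization constant.

import numpy as np

def lp_nested_norm(x, tree):
    """Evaluate a nested cascade of L_p norms.

    tree is either an int (a leaf, indexing a coordinate of x) or a pair
    (p, children), where each child is again such a tree.
    """
    if isinstance(tree, int):
        return abs(x[tree])                        # leaf: absolute value of one coordinate
    p, children = tree
    vals = np.array([lp_nested_norm(x, child) for child in children])
    return (vals ** p).sum() ** (1.0 / p)          # L_p norm of the children's values

# Example: nu(x) = ( (|x0|^2 + |x1|^2)^(3/2) + |x2|^3 )^(1/3)
x = np.array([1.0, -2.0, 0.5])
tree = (3.0, [(2.0, [0, 1]), 2])
value = lp_nested_norm(x, tree)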
Wenpo Yao, Wenli Yao (2021)
We compare the two basic ordinal patterns used to characterize vector structures, namely the original and amplitude permutations. The original permutation consists of the indexes of the reorganized values in the original vector. By contrast, the amplitude permutation comprises the positions of the values in the reordered vector, and it directly reflects the temporal structure. To accurately convey the structural characteristics of vectors with equal values, we modify the indexes of equal values in the permutations to be the same, for example the smallest or largest index in each group of equalities. Overall, we clarify the relationship between the original and amplitude permutations; the results have implications for time- and amplitude-symmetric vectors and will lead to further theoretical and experimental studies.
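For a vector without ties, the two patterns reduce to an argsort and its inverse; the sketch below (my own illustration, not the authors' implementation) computes both, and does not implement the tie-handling modification described in the abstract.

import numpy as np

def original_permutation(v):
    """Indexes of the sorted values as found in the original vector."""
    return np.argsort(v, kind="stable")

def amplitude_permutation(v):
    """Position of each original value within the sorted vector."""
    return np.argsort(np.argsort(v, kind="stable"), kind="stable")

v = np.array([4.0, 1.0, 3.0, 1.0])
orig = original_permutation(v)     # [1, 3, 2, 0]: where the sorted values came from
ampl = amplitude_permutation(v)    # [3, 0, 2, 1]: where each value lands after sorting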
Survival analysis is a challenging variation of regression modeling because of the presence of censoring, where the outcome measurement is only partially known, due to, for example, loss to follow up. Such problems come up frequently in medical applications, making survival analysis a key endeavor in biostatistics and machine learning for healthcare, with Cox regression models being amongst the most commonly employed models. We describe a new approach for survival analysis regression models, based on learning mixtures of Cox regressions to model individual survival distributions. We propose an approximation to the Expectation Maximization algorithm for this model that does hard assignments to mixture groups to make optimization efficient. In each group assignment, we fit the hazard ratios within each group using deep neural networks, and the baseline hazard for each mixture component non-parametrically. We perform experiments on multiple real world datasets, and look at the mortality rates of patients across ethnicity and gender. We emphasize the importance of calibration in healthcare settings and demonstrate that our approach outperforms classical and modern survival analysis baselines, both in terms of discriminative performance and calibration, with large gains in performance on the minority demographics.
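To make the hard-assignment EM loop concrete, the sketch below alternates a hard E-step (reassign each subject to the component under which its possibly censored observation is most likely) with an M-step that refits each component on its current members. For brevity it replaces the paper's deep-network hazard ratios and nonparametric baselines with a simple exponential regression hazard fitted by gradient ascent, so it only illustrates the loop structure, not the proposed model.

import numpy as np

rng = np.random.default_rng(0)

def exp_loglik(beta, x, t, event):
    """Censoring-aware log-likelihood of an exponential hazard h(t | x) = exp(x @ beta)."""
    lam = np.exp(x @ beta)
    return event * np.log(lam) - lam * t

def fit_exponential(x, t, event, steps=200, lr=0.1):
    """Crude gradient ascent for the exponential regression model above."""
    beta = np.zeros(x.shape[1])
    for _ in range(steps):
        lam = np.exp(x @ beta)
        beta += lr * x.T @ (event - lam * t) / len(t)
    return beta

def hard_em_mixture(x, t, event, k=2, iters=20):
    """Hard-assignment EM for a k-component mixture of hazard regressions (sketch)."""
    z = rng.integers(0, k, size=len(t))                 # random initial group labels
    betas = [np.zeros(x.shape[1]) for _ in range(k)]
    for _ in range(iters):
        for j in range(k):                              # M-step: refit each component
            m = z == j
            if m.sum() > x.shape[1]:
                betas[j] = fit_exponential(x[m], t[m], event[m])
        scores = np.column_stack([exp_loglik(b, x, t, event) for b in betas])
        z = scores.argmax(axis=1)                       # hard E-step: best-scoring component
    return betas, z

# Tiny synthetic check: two latent groups with different covariate effects
n = 400
x = rng.normal(size=(n, 2))
true_z = rng.integers(0, 2, size=n)
rate = np.exp(np.where(true_z == 0, x[:, 0], -x[:, 1]))
t = rng.exponential(1.0 / rate)
event = (t < 3.0).astype(float)
t = np.minimum(t, 3.0)
betas, labels = hard_em_mixture(x, t, event)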
Over the last two decades, many exciting variable selection methods have been developed for finding a small group of covariates that are associated with the response from a large pool. Can the discoveries from these data mining approaches be spurious due to high dimensionality and limited sample size? Can our fundamental assumptions about the exogeneity of the covariates needed for such variable selection be validated with the data? To answer these questions, we need to derive the distributions of the maximum spurious correlations given a certain number of predictors, namely, the distribution of the correlation of a response variable $Y$ with the best $s$ linear combinations of $p$ covariates $\mathbf{X}$, even when $\mathbf{X}$ and $Y$ are independent. When the covariance matrix of $\mathbf{X}$ possesses the restricted eigenvalue property, we derive such distributions for both a finite $s$ and a diverging $s$, using Gaussian approximation and empirical process techniques. However, such a distribution depends on the unknown covariance matrix of $\mathbf{X}$. Hence, we use the multiplier bootstrap procedure to approximate the unknown distributions and establish the consistency of such a simple bootstrap approach. The results are further extended to the situation where the residuals are from regularized fits. Our approach is then used to construct the upper confidence limit for the maximum spurious correlation and to test the exogeneity of the covariates. The former provides a baseline for guarding against false discoveries and the latter tests whether our fundamental assumptions for high-dimensional model selection are statistically valid. Our techniques and results are illustrated with both numerical examples and real data analysis.
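For the simplest case s = 1, the maximum spurious correlation is just the largest absolute sample correlation between $Y$ and a single covariate, and the multiplier bootstrap perturbs the centered per-observation products with independent Gaussian weights to approximate its null distribution. The sketch below covers only that special case; the general-s statistic, the restricted-eigenvalue conditions, and regularized residuals are not handled, and the details reflect my reading of a standard multiplier bootstrap rather than the paper's exact procedure.

import numpy as np

rng = np.random.default_rng(1)

def max_spurious_corr(X, y):
    """Largest absolute sample correlation of y with a single covariate (the s = 1 case)."""
    Xs = (X - X.mean(0)) / X.std(0)
    ys = (y - y.mean()) / y.std()
    return np.abs(Xs.T @ ys / len(y)).max()

def multiplier_bootstrap_quantile(X, y, level=0.95, B=1000):
    """Upper quantile of the multiplier-bootstrap analogue of max_spurious_corr."""
    Xs = (X - X.mean(0)) / X.std(0)
    ys = (y - y.mean()) / y.std()
    prods = Xs * ys[:, None]                       # per-observation products, one column per covariate
    prods -= prods.mean(0)                         # center before applying multipliers
    n = len(y)
    draws = np.empty(B)
    for b in range(B):
        e = rng.standard_normal(n)                 # i.i.d. N(0, 1) multipliers
        draws[b] = np.abs(e @ prods / n).max()
    return np.quantile(draws, level)

# Independent X and y: the observed maximum correlation against its bootstrap benchmark
n, p = 200, 500
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
observed = max_spurious_corr(X, y)
benchmark = multiplier_bootstrap_quantile(X, y)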
