The normal copula with a correlation coefficient between $-1$ and $1$ is tail independent, and so it severely underestimates extreme probabilities. By letting the correlation coefficient in a normal copula depend on the sample size, Husler and Reiss (1989) showed that the tail can become asymptotically dependent. In this paper, we extend this result by deriving the limit of the normalized maximum of $n$ independent observations, where the $i$-th observation follows a normal copula whose correlation coefficient is either a parametric or a nonparametric function of $i/n$. Furthermore, both parametric and nonparametric inference for this unknown function are studied, which can be employed to test the condition in Husler and Reiss (1989). A simulation study and a real data analysis are also presented.
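The tail independence mentioned above can be illustrated numerically. The sketch below is not taken from the paper: the helper names `joint_upper_tail` and `chi`, the fixed correlation of 0.5, and the trapezoidal integration settings are all assumptions made for illustration. It evaluates the conditional exceedance probability $P(U > u \mid V > u)$ under a normal copula at increasingly extreme levels $u$.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def joint_upper_tail(z, rho, steps=4000, upper=10.0):
    """Approximate P(X > z, Y > z) for a standard bivariate normal with
    correlation rho, using the trapezoidal rule on
    int_z^upper phi(x) * P(Y > z | X = x) dx.
    (steps and upper are illustrative tuning choices.)"""
    s = math.sqrt(1.0 - rho * rho)
    h = (upper - z) / steps
    total = 0.0
    for k in range(steps + 1):
        x = z + k * h
        # Y | X = x is normal with mean rho * x and standard deviation s
        f = STD.pdf(x) * (1.0 - STD.cdf((z - rho * x) / s))
        total += f * (0.5 if k in (0, steps) else 1.0)
    return total * h

def chi(u, rho):
    """Conditional exceedance probability P(U > u | V > u) under a normal copula."""
    z = STD.inv_cdf(u)
    return joint_upper_tail(z, rho) / (1.0 - u)

if __name__ == "__main__":
    for u in (0.9, 0.99, 0.999):
        print(u, chi(u, 0.5))
```

For a fixed correlation below 1, the printed values shrink as $u$ approaches 1, which is the tail independence that motivates letting the correlation depend on the sample size.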
A new bivariate copula is proposed for modeling negative dependence between two random variables. We show that it complies with most of the popular notions of negative dependence reported in the literature and study some of its basic properties. Specifically, Spearman's rho and Kendall's tau for the proposed copula have a simple one-parameter form with negative values in the full range. Some important ordering properties comparing the strength of negative dependence with respect to the parameter involved are considered. Simple examples of the corresponding bivariate distributions with popular marginals are presented. Application of the proposed copula is illustrated using a real data set.
Bivariate normal distributions are often used to describe the joint probability density of a pair of random variables. These distributions arise across many domains, from telecommunications to meteorology, ballistics, and computational neuroscience. In these applications, it is often useful to radially and angularly marginalize (i.e., under a polar transformation) the joint probability distribution relative to the coordinate system's origin. This marginalization is trivial for a zero-mean, isotropic distribution, but is non-trivial for the most general case of a non-zero-mean, anisotropic distribution with a non-diagonal covariance matrix. Across domains, a range of solutions with varying degrees of generality have been derived. Here, we provide a concise summary of analytic solutions for the polar marginalization of bivariate normal distributions. This report accompanies a Matlab (Mathworks, Inc.) and R toolbox that provides closed-form and numeric implementations for the marginalizations described herein.
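For the trivial zero-mean, isotropic case mentioned above, the radial marginal is the Rayleigh distribution and the angular marginal is uniform on $(-\pi, \pi]$. The following sketch checks this by simulation. It is an illustration only and is unrelated to the accompanying Matlab and R toolbox; the function names, the value of `sigma`, and the sample size are assumptions.

```python
import math
import random

def rayleigh_cdf(r, sigma):
    """Radial marginal CDF of a zero-mean, isotropic bivariate normal."""
    return 1.0 - math.exp(-r * r / (2.0 * sigma * sigma))

def sample_polar(n, sigma, seed=0):
    """Draw n points from N(0, sigma^2 I) and return their polar coordinates."""
    rng = random.Random(seed)
    radii, angles = [], []
    for _ in range(n):
        x = rng.gauss(0.0, sigma)
        y = rng.gauss(0.0, sigma)
        radii.append(math.hypot(x, y))
        angles.append(math.atan2(y, x))
    return radii, angles

def empirical_cdf(values, t):
    """Fraction of values that are less than or equal to t."""
    return sum(v <= t for v in values) / len(values)

sigma = 1.5  # illustrative standard deviation
radii, angles = sample_polar(20000, sigma)

# Gap between the empirical and the Rayleigh radial CDF at r = 2
radial_gap = abs(empirical_cdf(radii, 2.0) - rayleigh_cdf(2.0, sigma))
# A uniform angle on (-pi, pi] falls at or below 0 with probability 1/2
angular_gap = abs(empirical_cdf(angles, 0.0) - 0.5)
```

Both gaps should be small for a sample of this size. The non-zero-mean and anisotropic cases have no such simple form, which is why the analytic solutions summarized in the report are needed.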
Quantile regression, that is, the prediction of conditional quantiles, has steadily gained importance in statistical modeling and financial applications. The authors introduce a new semiparametric quantile regression method based on sequentially fitting a likelihood-optimal D-vine copula to given data, resulting in highly flexible models with easily extractable conditional quantiles. As a subclass of regular vine copulas, D-vines enable the modeling of multivariate copulas in terms of bivariate building blocks, a so-called pair-copula construction (PCC). The proposed algorithm is fast and accurate even in high dimensions and incorporates automatic variable selection by maximizing the conditional log-likelihood. Further, typical issues of quantile regression such as quantile crossing or transformations, interactions and collinearity of variables are automatically taken care of. In a simulation study, the improved accuracy and reduced computational time of the approach in comparison with established quantile regression methods are highlighted. An extensive financial application to international credit default swap (CDS) data including stress testing and Value-at-Risk (VaR) prediction demonstrates the usefulness of the proposed method.
Blocking is often used to reduce known variability in designed experiments by collecting together homogeneous experimental units. A common modelling assumption for such experiments is that responses from units within a block are dependent. Accounting for such dependencies in both the design of the experiment and the modelling of the resulting data when the response is not normally distributed can be challenging, particularly in terms of the computation required to find an optimal design. The application of copulas and marginal modelling provides a computationally efficient approach for estimating population-average treatment effects. Motivated by an experiment from materials testing, we develop and demonstrate designs with blocks of size two using copula models. Such designs are also important in applications ranging from microarray experiments to experiments on human eyes or limbs with naturally occurring blocks of size two. We present methodology for design selection, make comparisons to existing approaches in the literature and assess the robustness of the designs to modelling assumptions.
In recent biomedical research, a fundamental problem is to integratively cluster a set of objects using datasets from multiple sources. Such problems are mostly encountered in genomics, where data are collected from various sources and typically represent distinct yet complementary information. Integrating these data sources for multi-source clustering is challenging due to their complex dependence structure, including directional dependency. Particularly in genomics studies, it is known that there is certain directional dependence between DNA expression, DNA methylation, and RNA expression, widely called The Central Dogma. Most of the existing multi-view clustering methods assume either an independent structure or pair-wise (non-directional) dependency, thereby ignoring the directional relationship. Motivated by this, we propose a copula-based multi-view clustering model in which a copula enables the model to accommodate the directional dependence existing in the datasets. We conduct a simulation experiment in which the simulated datasets exhibit inherent directional dependence; it turns out that ignoring the directional dependence negatively affects the clustering performance. As a real application, we apply our model to the breast cancer tumor samples collected from The Cancer Genome Atlas (TCGA).