In network analysis, many community detection algorithms have been developed; however, their implementations leave unaddressed the question of the statistical validation of the results. Here we present robin (ROBustness In Network), an R package that assesses the robustness of the community structure of a network found by one or more methods, giving an indication of their reliability. The procedure first tests whether the community structure found by a set of algorithms is statistically significant and then compares two selected detection algorithms on the same graph to choose the one that better fits the network of interest. We demonstrate the use of our package on the American College Football benchmark dataset.
Pooled testing (also known as group testing), where diagnostic tests are performed on pooled samples, has broad applications in the surveillance of diseases in animals and humans. An increasingly common use case is molecular xenomonitoring (MX), where surveillance of vector-borne diseases is conducted by capturing and testing large numbers of vectors (e.g. mosquitoes). The R package PoolTestR was developed to meet the needs of increasingly large and complex molecular xenomonitoring surveys but can be applied to analyse any data involving pooled testing. PoolTestR includes simple and flexible tools to estimate prevalence and fit fixed- and mixed-effect generalised linear models for pooled data in frequentist and Bayesian frameworks. Mixed-effect models allow users to account for the hierarchical sampling designs that are often employed in surveys, including MX. We demonstrate the utility of PoolTestR by applying it to a large synthetic dataset that emulates an MX survey with a hierarchical sampling design.
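As a rough illustration of what estimating prevalence from pooled data involves, the simplest textbook case can be sketched in a few lines of Python. This is not PoolTestR's interface: the function name `pooled_prevalence_mle` is hypothetical, and the sketch assumes equal pool sizes, a perfect diagnostic test, and independent individuals, none of which the abstract states. Under those assumptions a pool of size s is negative with probability (1 - p)^s, which gives a closed-form maximum likelihood estimate.

```python
def pooled_prevalence_mle(n_pools, n_positive, pool_size):
    """Maximum likelihood estimate of individual-level prevalence.

    Assumptions (illustrative, not taken from PoolTestR): every pool holds
    `pool_size` individuals, the test has perfect sensitivity and
    specificity, and individuals are independent.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    if not 0 <= n_positive <= n_pools:
        raise ValueError("n_positive must lie between 0 and n_pools")
    # A pool tests negative only if all of its members are negative,
    # so P(pool negative) = (1 - p) ** pool_size.
    fraction_negative = 1 - n_positive / n_pools
    # Invert that relationship to recover p from the observed fraction.
    return 1 - fraction_negative ** (1 / pool_size)


if __name__ == "__main__":
    # 12 positive pools out of 200, each pool containing 25 mosquitoes.
    print(pooled_prevalence_mle(200, 12, 25))
```

Hierarchical designs, unequal pool sizes, and covariates break this closed form, which is where regression models of the kind the abstract describes become necessary.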
We introduce and illustrate through numerical examples the R package SIHR, which handles statistical inference for (1) linear and quadratic functionals in high-dimensional linear regression and (2) linear functionals in high-dimensional logistic regression. The focus of the proposed algorithms is on point estimation, confidence interval construction and hypothesis testing. The inference methods are extended to multiple regression models. We include real data applications to demonstrate the package's performance and practicality.
SDRcausal is a package that implements sufficient dimension reduction methods for causal inference as proposed in Ghosh, Ma, and de Luna (2021). The package implements (augmented) inverse probability weighting and outcome regression (imputation) estimators of an average treatment effect (ATE) parameter. The nuisance models, namely the treatment assignment probability given the covariates (propensity score) and the outcome regression models, are fitted using semiparametric locally efficient dimension reduction estimators, thereby allowing for large sets of confounding covariates. Techniques including linear extrapolation, numerical differentiation, and truncation have been used to obtain a practicable implementation of the methods. Finding a suitable dimension reduction map (central mean subspace) requires solving an optimization problem, and several optimization algorithms are offered to the user. The package also provides estimators of the asymptotic variances of the causal effect estimators implemented. Plotting options are provided. The core of the methods is implemented in C, and parallelization is supported. The free and user-friendly R language is used as the interface. The package can be downloaded from the GitHub repository: https://github.com/stat4reg.
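For readers unfamiliar with the estimators named above, the following Python sketch shows the standard inverse probability weighting (IPW) and augmented IPW forms of the ATE estimator. It is not SDRcausal's code or interface: the function names are hypothetical, and the propensity scores `e` and outcome predictions `m1`, `m0` are simply taken as given here, whereas the package fits them by dimension reduction.

```python
def ipw_ate(y, t, e):
    """IPW estimate of the ATE.

    y: observed outcomes, t: treatment indicators (0 or 1),
    e: propensity scores P(T = 1 | X), assumed known and strictly in (0, 1).
    """
    n = len(y)
    treated = sum(ti * yi / ei for yi, ti, ei in zip(y, t, e))
    control = sum((1 - ti) * yi / (1 - ei) for yi, ti, ei in zip(y, t, e))
    return (treated - control) / n


def aipw_ate(y, t, e, m1, m0):
    """Augmented IPW estimate of the ATE.

    m1, m0: outcome-model predictions E[Y | T = 1, X] and E[Y | T = 0, X],
    assumed to be supplied by some previously fitted model.
    """
    n = len(y)
    total = 0.0
    for yi, ti, ei, m1i, m0i in zip(y, t, e, m1, m0):
        # Outcome-regression contrast plus weighted residual corrections.
        total += m1i - m0i
        total += ti * (yi - m1i) / ei
        total -= (1 - ti) * (yi - m0i) / (1 - ei)
    return total / n


if __name__ == "__main__":
    y = [3.0, 1.0, 4.0, 2.0]
    t = [1, 0, 1, 0]
    e = [0.5, 0.5, 0.5, 0.5]
    print(ipw_ate(y, t, e))
    print(aipw_ate(y, t, e, m1=[3.0, 3.0, 4.0, 4.0], m0=[1.0, 1.0, 2.0, 2.0]))
```

The augmented form remains consistent if either the propensity model or the outcome model is correctly specified, which is the usual motivation for combining the two.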
The R package sns implements the Stochastic Newton Sampler (SNS), a Metropolis-Hastings Markov chain Monte Carlo algorithm where the proposal density function is a multivariate Gaussian based on a local, second-order Taylor series expansion of the log-density. The mean of the proposal function is the full Newton step in the Newton-Raphson optimization algorithm. Taking advantage of the local, multivariate geometry captured in the log-density Hessian allows SNS to be more efficient than univariate samplers, approaching independent sampling as the density function increasingly resembles a multivariate Gaussian. SNS requires the log-density Hessian to be negative-definite everywhere in order to construct a valid proposal function. This property holds, or can be easily checked, for many GLM-like models. When the initial point is far from the density peak, running SNS in non-stochastic mode by taking the Newton step, augmented with line search, allows the MCMC chain to converge to high-density areas faster. For high-dimensional problems, partitioning the state space into lower-dimensional subsets and applying SNS to the subsets within a Gibbs sampling framework can significantly improve the mixing of SNS chains. In addition to the above strategies for improving convergence and mixing, sns offers diagnostics and visualization capabilities, as well as a function for sample-based calculation of Bayesian posterior predictive distributions.
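The proposal described above can be sketched for the one-dimensional case, where the Hessian is a scalar. This is an illustrative reading of the abstract, not the sns package's code: the names `sns_step` and `run_chain` are hypothetical, and line search, the non-stochastic mode, and state-space partitioning are all omitted. The proposal at x is Gaussian with mean x - g(x)/h(x) (the Newton step) and variance -1/h(x), where g and h are the first and second derivatives of the log-density.

```python
import math
import random


def sns_step(x, logf, grad, hess, rng):
    """One Metropolis-Hastings step with a Newton-based Gaussian proposal (1-D)."""

    def proposal(z):
        h = hess(z)
        if h >= 0:
            raise ValueError("log-density second derivative must be negative")
        # Mean is the full Newton step; variance is the negative inverse Hessian.
        return z - grad(z) / h, -1.0 / h

    def log_q(to, mean, var):
        # Log of the Gaussian proposal density evaluated at `to`.
        return -0.5 * math.log(2 * math.pi * var) - (to - mean) ** 2 / (2 * var)

    mean_x, var_x = proposal(x)
    y = rng.gauss(mean_x, math.sqrt(var_x))
    mean_y, var_y = proposal(y)
    # The proposal depends on the current state, so the Hastings correction is needed.
    log_ratio = (logf(y) - logf(x)
                 + log_q(x, mean_y, var_y) - log_q(y, mean_x, var_x))
    accepted = rng.random() < math.exp(min(0.0, log_ratio))
    return (y if accepted else x), accepted


def run_chain(x0, logf, grad, hess, n_iter, seed=0):
    """Run the sampler for n_iter steps; return the draws and the acceptance rate."""
    rng = random.Random(seed)
    x, draws, n_accepted = x0, [], 0
    for _ in range(n_iter):
        x, accepted = sns_step(x, logf, grad, hess, rng)
        draws.append(x)
        n_accepted += accepted
    return draws, n_accepted / n_iter


if __name__ == "__main__":
    mu, sd = 2.0, 1.5  # Gaussian target, for which the proposal equals the target
    draws, rate = run_chain(
        10.0,
        logf=lambda x: -((x - mu) ** 2) / (2 * sd ** 2),
        grad=lambda x: -(x - mu) / sd ** 2,
        hess=lambda x: -1.0 / sd ** 2,
        n_iter=5000,
    )
    print(rate, sum(draws) / len(draws))
```

With a Gaussian target the proposal coincides with the target itself, so nearly every draw is accepted and the chain behaves like independent sampling, which is the limiting behaviour the abstract refers to.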
This paper proposes consistent and asymptotically Gaussian estimators for the drift, the diffusion coefficient and the Hurst exponent of the discretely observed fractional Ornstein-Uhlenbeck process. For the estimation of the drift, the results are obtained only in the case when 1/2 < H < 3/4. This paper also provides ready-to-use software for the R statistical environment based on the YUIMA package.