No Arabic abstract
Background and objective: The stepped wedge cluster randomized trial is a study design increasingly used for public health intervention evaluations. Most previous literature focuses on power calculations for this particular type of cluster randomized trials for continuous outcomes, along with an approximation to this approach for binary outcomes. Although not accurate for binary outcomes, it has been widely used. To improve the approximation for binary outcomes, two new methods for stepped wedge designs (SWDs) of binary outcomes have recently been published. However, these new methods have not been implemented in publicly available software. The objective of this paper is to present power calculation software for SWDs in various settings for both continuous and binary outcomes. Methods: We have developed a SAS macro %swdpwr and an R package swdpwr for power calculation in SWDs. Different scenarios including cross-sectional and cohort designs, binary and continuous outcomes, marginal and conditional models, three link functions, with and without time effects are accommodated in this software. Results: swdpwr provides an efficient tool to support investigators in the design and analysis of stepped wedge cluster randomized trails. swdpwr addresses the implementation gap between newly proposed methodology and their application to obtain more accurate power calculations in SWDs. Conclusions: This user-friendly software makes the new methods more accessible and incorporates as many variations as currently available, which were not supported in other related packages. swdpwr is implemented under two platforms: SAS and R, satisfying the needs of investigators from various backgrounds.
Modeling the diameter distribution of trees in forest stands is a common forestry task that supports key biologically and economically relevant management decisions. The choice of model used to represent the diameter distribution and how to estimate its parameters has received much attention in the forestry literature; however, accessible software that facilitates comprehensive comparison of the myriad modeling approaches is not available. To this end, we developed an R package called ForestFit that simplifies estimation of common probability distributions used to model tree diameter distributions, including the two- and three-parameter Weibull distributions, Johnsons SB distribution, Birnbaum-Saunders distribution, and finite mixture distributions. Frequentist and Bayesian techniques are provided for individual tree diameter data, as well as grouped data. Additional functionality facilitates fitting growth curves to height-diameter data. The package also provides a set of functions for computing probability distributions and simulating random realizations from common finite mixture models.
In network analysis, many community detection algorithms have been developed, however, their implementation leaves unaddressed the question of the statistical validation of the results. Here we present robin(ROBustness In Network), an R package to assess the robustness of the community structure of a network found by one or more methods to give indications about their reliability. The procedure initially detects if the community structure found by a set of algorithms is statistically significant and then compares two selected detection algorithms on the same graph to choose the one that better fits the network of interest. We demonstrate the use of our package on the American College Football benchmark dataset.
In this article, we develop methods for sample size and power calculations in four-level intervention studies when intervention assignment is carried out at any level, with a particular focus on cluster randomized trials (CRTs). CRTs involving four levels are becoming popular in health care research, where the effects are measured, for example, from evaluations (level 1) within participants (level 2) in divisions (level 3) that are nested in clusters (level 4). In such multi-level CRTs, we consider three types of intraclass correlations between different evaluations to account for such clustering: that of the same participant, that of different participants from the same division, and that of different participants from different divisions in the same cluster. Assuming arbitrary link and variance functions, with the proposed correlation structure as the true correlation structure, closed-form sample size formulas for randomization carried out at any level (including individually randomized trials within a four-level clustered structure) are derived based on the generalized estimating equations approach using the model-based variance and using the sandwich variance with an independence working correlation matrix. We demonstrate that empirical power corresponds well with that predicted by the proposed method for as few as 8 clusters, when data are analyzed using the matrix-adjusted estimating equations for the correlation parameters with a bias-corrected sandwich variance estimator, under both balanced and unbalanced designs.
Pooled testing (also known as group testing), where diagnostic tests are performed on pooled samples, has broad applications in the surveillance of diseases in animals and humans. An increasingly common use case is molecular xenomonitoring (MX), where surveillance of vector-borne diseases is conducted by capturing and testing large numbers of vectors (e.g. mosquitoes). The R package PoolTestR was developed to meet the needs of increasingly large and complex molecular xenomonitoring surveys but can be applied to analyse any data involving pooled testing. PoolTestR includes simple and flexible tools to estimate prevalence and fit fixed- and mixed-effect generalised linear models for pooled data in frequentist and Bayesian frameworks. Mixed-effect models allow users to account for the hierarchical sampling designs that are often employed in surveys, including MX. We demonstrate the utility of PoolTestR by applying it to a large synthetic dataset that emulates a MX survey with a hierarchical sampling design.
We introduce and illustrate through numerical examples the R package texttt{SIHR} which handles the statistical inference for (1) linear and quadratic functionals in the high-dimensional linear regression and (2) linear functional in the high-dimensional logistic regression. The focus of the proposed algorithms is on the point estimation, confidence interval construction and hypothesis testing. The inference methods are extended to multiple regression models. We include real data applications to demonstrate the packages performance and practicality.