Data Mining in Large Frequency Tables With Ontology, with an Application to the Vaccine Adverse Event Reporting System


Abstract in English

Vaccine safety is a major public concern, and many signal detection methods have been developed to estimate relative risks between vaccines and adverse events (AEs). These methods usually focus on individual AEs, where the data are highly variable, so the results are often inaccurate and lack clinical meaning. The AE ontology encodes the biological similarity of AEs. Building on it, we extend the concept of relative risk (RR) to the AE group level, which permits more accurate and meaningful estimation by using data from the whole group. In this paper, we propose zGPS.AO (Zero Inflated Gamma Poisson Shrinker with AE Ontology), a method based on the zero-inflated negative binomial distribution. The model has two components: a regression model that estimates group-level RRs, and an empirical Bayes framework that evaluates AE-level RRs. The regression component handles both excess zeros and overdispersion in the data, and the empirical Bayes component borrows information from both the group level and the AE level to reduce noise and stabilize the AE-level results. We demonstrate the unbiasedness and low variance of our model on simulated data, and obtain meaningful results on the VAERS (Vaccine Adverse Event Reporting System) database that are consistent with previous studies. The proposed methods are implemented in the R package zGPS.AO, which can be installed from the Comprehensive R Archive Network (CRAN). The results on VAERS data are visualized in an interactive web app built with R Shiny.
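To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch under stated assumptions. It is not the zGPS.AO implementation or its API. The function names (`nb_pmf`, `zinb_pmf`, `shrunk_rr`), the mean/size parameterization of the negative binomial, and the single Gamma(shape, rate) prior are all illustrative choices made here. The sketch shows a zero-inflated negative binomial probability mass function, and the generic gamma-Poisson shrinkage estimate of a relative risk, where an observed count is modeled as Poisson with mean equal to the expected count times the RR.

```python
import math


def nb_pmf(y, mu, size):
    """Negative binomial pmf with mean mu > 0 and dispersion parameter size > 0.

    Smaller size means stronger overdispersion (variance = mu + mu**2 / size).
    """
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, size, pi_zero):
    """Zero-inflated negative binomial pmf.

    With probability pi_zero the count is a structural zero; otherwise it is
    drawn from the negative binomial above. This is how excess zeros enter.
    """
    base = (1.0 - pi_zero) * nb_pmf(y, mu, size)
    return pi_zero + base if y == 0 else base


def shrunk_rr(observed, expected, shape, rate):
    """Posterior mean of RR for count ~ Poisson(expected * RR), RR ~ Gamma(shape, rate).

    Assumption: a single gamma prior, used here only to illustrate shrinkage.
    """
    return (shape + observed) / (rate + expected)


if __name__ == "__main__":
    # Hypothetical vaccine-AE cell: 3 reports observed, 0.5 expected.
    crude = 3 / 0.5
    shrunk = shrunk_rr(observed=3, expected=0.5, shape=2.0, rate=2.0)
    print("crude RR:", crude, "shrunk RR:", shrunk)
    print("P(Y = 0) under ZINB:", zinb_pmf(0, mu=2.0, size=1.5, pi_zero=0.3))
```

In this toy cell the crude ratio of observed to expected counts is pulled toward the prior mean of 1, which illustrates the stabilizing effect that the abstract attributes to borrowing information. In the proposed method that information comes from the AE group rather than from a fixed prior.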
