Accurate estimation of influenza epidemics using Google search data via ARGO

Added by Shihao Yang

Publication date 2015

fields Mathematical Statistics Informatics Engineering

Research language: English

Authors Shihao Yang - Mauricio Santillana -

Accurate real-time tracking of influenza outbreaks helps public health officials make timely and meaningful decisions that could save lives. We propose an influenza tracking model, ARGO (AutoRegression with GOogle search data), that uses publicly available online search data. In addition to having a rigorous statistical foundation, ARGO outperforms all previously available Google-search-based tracking models, including the latest version of Google Flu Trends, even though it uses only low-quality search data as input from the publicly available Google Trends and Google Correlate websites. ARGO not only incorporates the seasonality in influenza epidemics but also captures changes in people's online search behavior over time. ARGO is also flexible, self-correcting, robust, and scalable, making it a potentially powerful tool that can be used for real-time tracking of other social events at multiple temporal and spatial resolutions.
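The abstract names the two ingredients of ARGO, autoregression and Google search data, without spelling out the estimator. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes an L1-penalized linear regression on lagged flu values plus current search volumes, refit on a rolling window of recent weeks so the coefficients can follow shifts in search behavior. The function names, the penalty value, the window length, and the synthetic data are all illustrative assumptions.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(rows, targets, penalty, sweeps=200):
    """Coordinate descent for 0.5 * sum((y - b0 - x.b)^2) + penalty * sum(|b|)."""
    n, p = len(rows), len(rows[0])
    col_means = [sum(row[j] for row in rows) / n for j in range(p)]
    y_mean = sum(targets) / n
    centered = [[row[j] - col_means[j] for j in range(p)] for row in rows]
    residual = [y - y_mean for y in targets]  # residual with all coefficients at zero
    coefs = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            column = [row[j] for row in centered]
            norm = sum(x * x for x in column)
            if norm == 0.0:
                continue
            # Inner product of feature j with the residual that excludes feature j.
            rho = sum(x * (r + coefs[j] * x) for x, r in zip(column, residual))
            updated = soft_threshold(rho, penalty) / norm
            delta = updated - coefs[j]
            residual = [r - delta * x for x, r in zip(column, residual)]
            coefs[j] = updated
    intercept = y_mean - sum(c * m for c, m in zip(coefs, col_means))
    return intercept, coefs


def build_features(flu, searches, n_lags, t):
    """Features for week t: the n_lags previous flu values plus week-t search volumes."""
    return [flu[t - k] for k in range(1, n_lags + 1)] + list(searches[t])


def argo_style_nowcast(flu, searches, n_lags, window, penalty, t):
    """Estimate flu[t] after training only on weeks t-window .. t-1 (rolling window)."""
    if t - window - n_lags < 0:
        raise ValueError("not enough history for this window and lag count")
    train_weeks = range(t - window, t)
    rows = [build_features(flu, searches, n_lags, s) for s in train_weeks]
    targets = [flu[s] for s in train_weeks]
    intercept, coefs = fit_lasso(rows, targets, penalty)
    x_now = build_features(flu, searches, n_lags, t)
    return intercept + sum(c * x for c, x in zip(coefs, x_now))


# Synthetic data (assumption): a seasonal flu curve, one informative search
# term that tracks it with noise, and one pure-noise search term.
rng = random.Random(0)
flu = [2.0 + math.sin(2 * math.pi * w / 52) for w in range(120)]
searches = [[flu[w] + rng.gauss(0, 0.05), rng.gauss(0, 1)] for w in range(120)]
estimate = argo_style_nowcast(flu, searches, n_lags=2, window=60, penalty=0.1, t=110)
print(round(estimate, 3), round(flu[110], 3))
```

In this sketch the lagged flu values carry the seasonal pattern, while refitting on only the most recent `window` weeks is one simple way to let the model adapt as search habits drift.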

Use Internet Search Data to Accurately Track State-Level Influenza Epidemics

Shihao Yang, Shaoyang Ning, S. C. Kou (2020)

For epidemic control and prevention, timely insight into potential hot spots is invaluable. As an alternative to traditional epidemic surveillance, which often lags behind real time by weeks, big data from the Internet provide important information on current epidemic trends. Here we present a methodology, ARGOX (Augmented Regression with GOogle data CROSS space), for accurate real-time tracking of state-level influenza epidemics in the United States. ARGOX combines Internet search data at the national, regional and state levels with traditional influenza surveillance data from the Centers for Disease Control and Prevention, and accounts for both the spatial correlation structure of state-level influenza activities and the evolution of people's Internet search patterns. ARGOX achieves on average 28% error reduction over the best alternative for real-time state-level influenza estimation for 2014 to 2020. ARGOX is robust and reliable and can potentially be applied to track county- and city-level influenza activity and other infectious diseases.

Applications

Sequence to Sequence with Attention for Influenza Prevalence Prediction using Google Trends

Kenjiro Kondo, Akihiko Ishikawa, Masashi Kimura (2019)

Early prediction of influenza prevalence reduces its impact. Various studies have been conducted to predict the number of influenza-infected people. However, these predictions are not highly accurate, especially for the more distant future, such as more than one month ahead. To deal with this problem, we investigate the sequence-to-sequence (Seq2Seq) with attention model using Google Trends data to assess and predict the number of influenza-infected people over the course of multiple weeks. Google Trends data help to compensate for the dark figures in the statistics and improve the prediction accuracy. We demonstrate that the attention mechanism is highly effective at improving prediction accuracy and achieves state-of-the-art results, with a Pearson correlation and root-mean-square error of 0.996 and 0.67, respectively. However, the prediction accuracy at the peak of an influenza epidemic is not sufficient, and further investigation is needed to overcome this problem.
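The two evaluation metrics quoted above, Pearson correlation and root-mean-square error, can be computed as in the short Python sketch below. The function names and the toy numbers are illustrative assumptions and are unrelated to the paper's data.

```python
import math


def pearson_correlation(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


def rmse(predicted, observed):
    """Root-mean-square error between predictions and observations."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


# Toy example (assumption): every prediction is too high by exactly 0.5.
observed = [1.0, 2.0, 4.0, 3.0]
predicted = [1.5, 2.5, 4.5, 3.5]
correlation = pearson_correlation(predicted, observed)
error = rmse(predicted, observed)
print(correlation, error)
```

The toy case shows why the two numbers are reported together: a constant bias leaves the correlation at 1 while the RMSE is 0.5, so correlation alone would hide the offset.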

Machine Learning Social and Information Networks

Accurate Assessment via Process Data

Susu Zhang, Zhi Wang, Jitong Qi (2021)

Accurate assessment of students' ability is the key task of a test. Assessments based on final responses are the standard. As the infrastructure advances, substantially more information is observed. One such instance is the process data collected by computer-based interactive items, which contain a student's detailed interactive processes. In this paper, we show both theoretically and empirically that appropriately including such information in the assessment will substantially improve relevant assessment precision. The precision is measured empirically by out-of-sample test reliability.

Applications

Forecasting unemployment using Internet search data via PRISM

Dingdong Yi, Shaoyang Ning, Chia-Jung Chang (2020)

Big data generated from the Internet offer great potential for predictive analysis. Here we focus on using online users' Internet search data to forecast unemployment initial claims weeks into the future, which provides timely insights into the direction of the economy. To this end, we present a novel method, PRISM (Penalized Regression with Inferred Seasonality Module), which uses publicly available online search data from Google. PRISM is a semi-parametric method, motivated by a general state-space formulation, and employs nonparametric seasonal decomposition and penalized regression. For forecasting unemployment initial claims, PRISM outperforms all previously available methods, including forecasting during the 2008-2009 financial crisis period and near-future forecasting during the COVID-19 pandemic period, when unemployment initial claims both rose rapidly. The timely and accurate unemployment forecasts by PRISM could help government agencies and financial institutions assess the economic trend and make well-informed decisions, especially in the face of economic turbulence.
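The two stages named in the abstract, seasonal decomposition followed by penalized regression, can be illustrated with a toy Python sketch. It is not PRISM itself. As stand-ins it assumes a phase-mean seasonal estimate in place of whatever nonparametric decomposition PRISM uses, and a single-feature ridge penalty in place of PRISM's penalized regression. All names and numbers are hypothetical.

```python
def seasonal_split(series, period):
    """Split a series into a repeating seasonal part and a remainder.

    The seasonal value for each phase is the mean of the observations at
    that phase minus the overall mean (a simple stand-in decomposition).
    """
    overall = sum(series) / len(series)
    profile = []
    for phase in range(period):
        values = series[phase::period]
        profile.append(sum(values) / len(values) - overall)
    seasonal = [profile[i % period] for i in range(len(series))]
    remainder = [y - s for y, s in zip(series, seasonal)]
    return seasonal, remainder


def ridge_slope(x, y, penalty):
    """Slope of a one-feature ridge regression of y on x (both centered)."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    cross = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    spread = sum((a - x_mean) ** 2 for a in x)
    return cross / (spread + penalty)


# Synthetic data (assumption): claims = level + seasonal pattern + 0.5 * search volume.
pattern = [1.0, -1.0, 2.0, -2.0]
search = [1.0] * 4 + [2.0] * 4 + [3.0] * 4
claims = [10.0 + pattern[i % 4] + 0.5 * search[i] for i in range(12)]

seasonal, adjusted = seasonal_split(claims, period=4)
slope = ridge_slope(search, adjusted, penalty=2.0)
print(seasonal[:4], slope)
```

Here the decomposition recovers the seasonal pattern, and the ridge penalty shrinks the estimated slope from the true 0.5 to 0.4, which is the usual trade of a little bias for stability when many search terms compete.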

Applications Methodology

Estimating influenza incidence using search query deceptiveness and generalized ridge regression

Reid Priedhorsky (2019)

Seasonal influenza is a sometimes surprisingly impactful disease, causing thousands of deaths per year along with much additional morbidity. Timely knowledge of the outbreak state is valuable for managing an effective response. The current state of the art is to gather this knowledge using in-person patient contact. While accurate, this is time-consuming and expensive. This has motivated inquiry into new approaches using internet activity traces, based on the theory that lay observations of health status lead to informative features in internet data. These approaches risk being deceived by activity traces having a coincidental, rather than informative, relationship to disease incidence; to our knowledge, this risk has not yet been quantitatively explored. We evaluated both simulated and real activity traces of varying deceptiveness for influenza incidence estimation using linear regression. We found that deceptiveness knowledge does reduce error in such estimates, that it may help automatically selected features perform as well as or better than features that require human curation, and that a semantic distance measure derived from the Wikipedia article category tree serves as a useful proxy for deceptiveness. This suggests that disease incidence estimation models should incorporate not only data about how internet features map to incidence but also additional data to estimate feature deceptiveness. By doing so, we may gain one more step along the path to accurate, reliable disease incidence estimation using internet data. This capability would improve public health by decreasing the cost and increasing the timeliness of such estimates.
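Generalized ridge regression, named in the title, gives each feature its own penalty instead of one shared penalty. The Python sketch below shows the closed-form solution on toy data. How a deceptiveness score would be mapped to a penalty is not stated in the abstract, so the penalties here are hypothetical values chosen by hand, and the function names are illustrative.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def generalized_ridge(rows, targets, penalties):
    """Minimize sum((y - x.b)^2) + sum(penalties[j] * b[j]^2), no intercept.

    Closed form: solve (X^T X + diag(penalties)) b = X^T y.
    """
    p = len(penalties)
    gram = [[sum(row[a] * row[b] for row in rows) for b in range(p)] for a in range(p)]
    for j in range(p):
        gram[j][j] += penalties[j]
    moment = [sum(row[a] * y for row, y in zip(rows, targets)) for a in range(p)]
    return solve_linear_system(gram, moment)


# Toy data (assumption): two centered, orthogonal features that both fit y equally well.
rows = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
targets = [2.0, 0.0, 0.0, -2.0]

# Hypothetical penalties: feature 0 is trusted, feature 1 is treated as deceptive.
coefs = generalized_ridge(rows, targets, penalties=[0.0, 12.0])
print(coefs)
```

With equal penalties this reduces to ordinary ridge regression. With unequal penalties, the feature judged more likely to be deceptive is shrunk harder (0.25 versus 1.0 in this toy case) even though both features fit the data equally well.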

Populations and Evolution Social and Information Networks Applications
