Previous research has demonstrated that various properties of infectious diseases can be inferred from online search behaviour. In this work, we use time series of online search query frequencies to gain insights about the prevalence of COVID-19 in multiple countries. We first develop unsupervised modelling techniques based on associated symptom categories identified by the United Kingdom's National Health Service and Public Health England. We then attempt to minimise an expected bias in these signals caused by public interest (as opposed to infections), using the proportion of news media coverage devoted to COVID-19 as a proxy indicator. Our analysis indicates that models based on online searches precede the reported confirmed cases and deaths by 16.7 (10.2 - 23.2) and 22.1 (17.4 - 26.9) days, respectively. We also investigate transfer learning techniques for mapping supervised models from countries where the spread of disease has progressed extensively to countries that are in earlier phases of their respective epidemic curves. Furthermore, we compare time series of online search activity against confirmed COVID-19 cases or deaths jointly across multiple countries, uncovering interesting querying patterns, including the finding that rarer symptoms are better predictors than common ones. Finally, we show that web searches improve the short-term forecasting accuracy of autoregressive models for COVID-19 deaths. Our work provides evidence that online search data can be used to develop complementary public health surveillance methods to help inform the COVID-19 response in conjunction with more established approaches.
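One simple way to estimate how far a search-based signal runs ahead of reported cases is to shift one series against the other and keep the shift with the highest correlation. The sketch below illustrates that idea on synthetic data. It is not necessarily the procedure used in the work above; the function name `best_lead_days`, the `max_lag` parameter, and the bell-shaped toy curves are all assumptions made for illustration.

```python
import numpy as np

def best_lead_days(search, cases, max_lag=30):
    """Return (lag, r): the shift in days, between 0 and max_lag, at which
    the search signal correlates best with the later case counts."""
    search = np.asarray(search, dtype=float)
    cases = np.asarray(cases, dtype=float)
    n = min(len(search), len(cases))
    best_lag, best_r = 0, -np.inf
    for lag in range(0, max_lag + 1):
        if n - lag < 3:  # too few overlapping points to correlate
            break
        # Pair the search signal on day t with cases on day t + lag.
        r = np.corrcoef(search[: n - lag], cases[lag:n])[0, 1]
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r

# Synthetic example (hypothetical data): the case curve is the search
# curve delayed by exactly 17 days.
days = np.arange(120)
search = np.exp(-0.5 * ((days - 40) / 10.0) ** 2)
cases = np.exp(-0.5 * ((days - 57) / 10.0) ** 2)
lag, r = best_lead_days(search, cases)
print(lag)
```

On real data the estimate would carry uncertainty, so an interval (for example from repeating the calculation across countries or resampled windows) would be reported alongside the point estimate.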
The ongoing, fluid nature of the COVID-19 pandemic requires individuals to regularly seek information about best health practices, local community spreading, and public health guidelines. In the absence of a unified response to the pandemic in the United States and clear, consistent directives from federal and local officials, people have used social media to collectively crowdsource COVID-19 elites, a small set of trusted COVID-19 information sources. We take a census of COVID-19 crowdsourced elites in the United States who have received sustained attention on Twitter during the pandemic. Using a mixed methods approach with a panel of Twitter users linked to public U.S. voter registration records, we find that journalists, media outlets, and political accounts have been consistently amplified around COVID-19, while epidemiologists, public health officials, and medical professionals make up only a small portion of all COVID-19 elites on Twitter. We show that COVID-19 elites vary considerably across demographic groups, and that there are notable racial, geographic, and political similarities and disparities between various groups and the demographics of their elites. With this variation in mind, we discuss the potential for using the disproportionate online voice of crowdsourced COVID-19 elites to equitably promote timely public health information and mitigate rampant misinformation.
The novel coronavirus pandemic continues to ravage communities across the US. Opinion surveys have identified the importance of political ideology in shaping perceptions of the pandemic and compliance with preventive measures. Here, we use social media data to study the complexity of polarization. We analyze a large dataset of tweets related to the pandemic collected between January and May of 2020, and develop methods to classify the ideological alignment of users along the moderacy (hardline vs moderate), political (liberal vs conservative) and science (anti-science vs pro-science) dimensions. While polarization along the science and political dimensions is correlated, politically moderate users are more likely to be aligned with pro-science views, and politically hardline users with anti-science views. Contrary to expectations, we do not find that polarization grows over time; instead, we see increasing activity by moderate pro-science users. We also show that anti-science conservatives tend to tweet from the Southern US, while anti-science moderates tend to tweet from the Western states. Our findings shed light on the multi-dimensional nature of polarization, and on the feasibility of tracking polarized opinions about the pandemic across time and space through social media data.
As the COVID-19 pandemic is disrupting life worldwide, related online communities are popping up. In particular, two new communities, /r/China_flu and /r/Coronavirus, emerged on Reddit and have been dedicated to COVID-related discussions from the very beginning of this pandemic. With /r/Coronavirus promoted as the official community on Reddit, it remains an open question how users choose between these two highly related communities. In this paper, we characterize user trajectories in these two communities from the beginning of COVID-19 to the end of September 2020. We show that new users of /r/China_flu and /r/Coronavirus were similar from January to March. After that, their differences steadily increased, as evidenced by both language distance and membership prediction, as the pandemic continued to unfold. Furthermore, users who started at /r/China_flu from January to March were more likely to leave, while those who started in later months tended to remain highly loyal. To understand this difference, we develop a movement analysis framework that tracks membership changes in these two communities, and we identify a significant proportion of /r/China_flu members (around 50%) who moved to /r/Coronavirus in February. This movement turns out to be highly predictable based on other subreddits that users were previously active in. Our work demonstrates how two highly related communities emerge and develop their own identity in a crisis, and highlights the important role of existing communities in understanding such an emergence.
We address the diffusion of information about COVID-19 with a massive data analysis on Twitter, Instagram, YouTube, Reddit and Gab. We analyze engagement and interest in the COVID-19 topic and provide a differential assessment of the evolution of the discourse on a global scale for each platform and its users. We fit information spreading with epidemic models, characterizing the basic reproduction number $R_0$ for each social media platform. Moreover, we characterize information spreading from questionable sources, finding different volumes of misinformation on each platform. However, information from reliable and questionable sources does not present different spreading patterns. Finally, we provide platform-dependent numerical estimates of rumor amplification.
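To make the idea of a basic reproduction number for information spreading concrete, the sketch below uses a standard property of the SIR model: early in an outbreak the number of active spreaders grows roughly as $\exp((\beta - \gamma)t)$, so $R_0 = \beta/\gamma = 1 + r/\gamma$, where $r$ is the observed exponential growth rate. This is only an illustration under that assumption and is not claimed to be the fitting procedure used in the work above; the function name `r0_from_early_growth`, the recovery rate `gamma`, and the synthetic counts are hypothetical.

```python
import numpy as np

def r0_from_early_growth(active_counts, gamma):
    """Estimate R0 = 1 + r / gamma, where r is the slope of
    log(active_counts) against time during the early growth phase."""
    counts = np.asarray(active_counts, dtype=float)
    t = np.arange(len(counts))
    # Degree-1 fit returns [slope, intercept]; the slope is the growth rate r.
    slope, _intercept = np.polyfit(t, np.log(counts), 1)
    return 1.0 + slope / gamma

# Synthetic early-phase data (hypothetical): beta = 0.5, gamma = 0.2,
# so the true value is R0 = beta / gamma = 2.5.
gamma = 0.2
beta = 0.5
t = np.arange(15)
active = 10.0 * np.exp((beta - gamma) * t)
r0_est = r0_from_early_growth(active, gamma)
print(round(r0_est, 3))
```

The estimate depends directly on the assumed `gamma`, so in practice both parameters would usually be fitted jointly to the full time series rather than fixing one in advance.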
The outbreak of COVID-19 highlights the need for a more harmonized, less privacy-concerning, easily accessible approach to monitoring human mobility, which has been shown to be associated with viral transmission. In this study, we analyzed 587 million tweets worldwide to see how global collaborative efforts to reduce human mobility are reflected in user-generated information at the global, country, and U.S. state scales. Considering the multifaceted nature of mobility, we propose two types of distance: the single-day distance and the cross-day distance. To quantify the responsiveness in certain geographical regions, we further propose a mobility-based responsive index (MRI) that captures the overall degree of mobility changes within a time window. The results suggest that mobility patterns obtained from Twitter data can quantitatively reflect mobility dynamics. Globally, the two proposed distances deviated greatly from their baselines after March 11, 2020, when the WHO declared COVID-19 a pandemic. The considerably weaker periodicity after the declaration suggests that the protection measures have clearly affected people's travel routines. The country-scale comparisons reveal discrepancies in responsiveness, evidenced by the contrasting mobility patterns in different epidemic phases. We find that the triggers of mobility changes correspond well with the national announcements of mitigation measures. In the U.S., the influence of the COVID-19 pandemic on mobility is distinct, although the impacts varied substantially among states. The strong momentum of mobility recovery was further fueled by the Black Lives Matter protests, potentially fostering a second wave of infections in the U.S.