No Arabic abstract
COVID-19 has affected the world economy and the daily life routine of almost everyone. It has been a hot topic on social media platforms such as Twitter, Facebook, etc. These social media platforms enable users to share information with other users who can reshare this information, thus causing this information to spread. Twitters retweet functionality allows users to share the existing content with other users without altering the original content. Analysis of social media platforms can help in detecting emergencies during pandemics that lead to taking preventive measures. One such type of analysis is predicting the number of retweets for a given COVID-19 related tweet. Recently, CIKM organized a retweet prediction challenge for COVID-19 tweets focusing on using numeric features only. However, our hypothesis is, tweet text may play a vital role in an accurate retweet prediction. In this paper, we combine numeric and text features for COVID-19 related retweet predictions. For this purpose, we propose two CNN and RNN based models and evaluate the performance of these models on a publicly available TweetsCOV19 dataset using seven different evaluation metrics. Our evaluation results show that combining tweet text with numeric features improves the performance of retweet prediction significantly.
As the COVID-19 pandemic is disrupting life worldwide, related online communities are popping up. In particular, two new communities, /r/China flu and /r/Coronavirus, emerged on Reddit and have been dedicated to COVID- related discussions from the very beginning of this pandemic. With /r/Coronavirus promoted as the official community on Reddit, it remains an open question how users choose between these two highly-related communities. In this paper, we characterize user trajectories in these two communities from the beginning of COVID-19 to the end of September 2020. We show that new users of /r/China flu and /r/Coronavirus were similar from January to March. After that, their differences steadily increase, evidenced by both language distance and membership prediction, as the pandemic continues to unfold. Furthermore, users who started at /r/China flu from January to March were more likely to leave, while those who started in later months tend to remain highly loyal. To understand this difference, we develop a movement analysis framework to understand membership changes in these two communities and identify a significant proportion of /r/China flu members (around 50%) that moved to /r/Coronavirus in February. This movement turns out to be highly predictable based on other subreddits that users were previously active in. Our work demonstrates how two highly-related communities emerge and develop their own identity in a crisis, and highlights the important role of existing communities in understanding such an emergence.
The COVID-19 pandemic has affected peoples lives around the world on an unprecedented scale. We intend to investigate hoarding behaviors in response to the pandemic using large-scale social media data. First, we collect hoarding-related tweets shortly after the outbreak of the coronavirus. Next, we analyze the hoarding and anti-hoarding patterns of over 42,000 unique Twitter users in the United States from March 1 to April 30, 2020, and dissect the hoarding-related tweets by age, gender, and geographic location. We find the percentage of females in both hoarding and anti-hoarding groups is higher than that of the general Twitter user population. Furthermore, using topic modeling, we investigate the opinions expressed towards the hoarding behavior by categorizing these topics according to demographic and geographic groups. We also calculate the anxiety scores for the hoarding and anti-hoarding related tweets using a lexical approach. By comparing their anxiety scores with the baseline Twitter anxiety score, we reveal further insights. The LIWC anxiety mean for the hoarding-related tweets is significantly higher than the baseline Twitter anxiety mean. Interestingly, beer has the highest calculated anxiety score compared to other hoarded items mentioned in the tweets.
Successful navigation of the Covid-19 pandemic is predicated on public cooperation with safety measures and appropriate perception of risk, in which emotion and attention play important roles. Signatures of public emotion and attention are present in social media data, thus natural language analysis of this text enables near-to-real-time monitoring of indicators of public risk perception. We compare key epidemiological indicators of the progression of the pandemic with indicators of the public perception of the pandemic constructed from ~20 million unique Covid-19-related tweets from 12 countries posted between 10th March -- 14th June 2020. We find evidence of psychophysical numbing: Twitter users increasingly fixate on mortality, but in a decreasingly emotional and increasingly analytic tone. Semantic network analysis based on word co-occurrences reveals changes in the emotional framing of Covid-19 casualties that are consistent with this hypothesis. We also find that the average attention afforded to national Covid-19 mortality rates is modelled accurately with the Weber-Fechner and power law functions of sensory perception. Our parameter estimates for these models are consistent with estimates from psychological experiments, and indicate that users in this dataset exhibit differential sensitivity by country to the national Covid-19 death rates. Our work illustrates the potential utility of social media for monitoring public risk perception and guiding public communication during crisis scenarios.
The objective of the study is to examine coronavirus disease (COVID-19) related discussions, concerns, and sentiments that emerged from tweets posted by Twitter users. We analyze 4 million Twitter messages related to the COVID-19 pandemic using a list of 25 hashtags such as coronavirus, COVID-19, quarantine from March 1 to April 21 in 2020. We use a machine learning approach, Latent Dirichlet Allocation (LDA), to identify popular unigram, bigrams, salient topics and themes, and sentiments in the collected Tweets. Popular unigrams include virus, lockdown, and quarantine. Popular bigrams include COVID-19, stay home, corona virus, social distancing, and new cases. We identify 13 discussion topics and categorize them into five different themes, such as public health measures to slow the spread of COVID-19, social stigma associated with COVID-19, coronavirus news cases and deaths, COVID-19 in the United States, and coronavirus cases in the rest of the world. Across all identified topics, the dominant sentiments for the spread of coronavirus are anticipation that measures that can be taken, followed by a mixed feeling of trust, anger, and fear for different topics. The public reveals a significant feeling of fear when they discuss the coronavirus new cases and deaths than other topics. The study shows that Twitter data and machine learning approaches can be leveraged for infodemiology study by studying the evolving public discussions and sentiments during the COVID-19. Real-time monitoring and assessment of the Twitter discussion and concerns can be promising for public health emergency responses and planning. Already emerged pandemic fear, stigma, and mental health concerns may continue to influence public trust when there occurs a second wave of COVID-19 or a new surge of the imminent pandemic.
Online social media provides a channel for monitoring peoples social behaviors and their mental distress. Due to the restrictions imposed by COVID-19 people are increasingly using online social networks to express their feelings. Consequently, there is a significant amount of diverse user-generated social media content. However, COVID-19 pandemic has changed the way we live, study, socialize and recreate and this has affected our well-being and mental health problems. There are growing researches that leverage online social media analysis to detect and assess users mental status. In this paper, we survey the literature of social media analysis for mental disorders detection, with a special focus on the studies conducted in the context of COVID-19 during 2020-2021. Firstly, we classify the surveyed studies in terms of feature extraction types, varying from language usage patterns to aesthetic preferences and online behaviors. Secondly, we explore detection methods used for mental disorders detection including machine learning and deep learning detection methods. Finally, we discuss the challenges of mental disorder detection using social media data, including the privacy and ethical concerns, as well as the technical challenges of scaling and deploying such systems at large scales, and discuss the learnt lessons over the last few years.