The ubiquity of smartphones has significantly increased the use of social media platforms such as Facebook, Twitter, TikTok, and LinkedIn among the public, government, and businesses. Facebook generated ~70 billion USD in advertising revenue alone in 2019, a ~27% increase over the previous year. Social media has also played a strong role in outbreaks of social protest responsible for political changes in different countries. As these examples show, social media plays a major role in business intelligence and international politics. In this paper, we present and discuss a high-level functional intelligence model (recipe) of Social Media Analysis (SMA). This model synthesizes the input data and uses operational intelligence to provide actionable recommendations. It also matches the synthesized function of the experiences and learning gained from the environment. The SMA model is independent of the application domain and can be applied to domains such as education, healthcare, and government. Finally, we present some of the challenges faced by SMA and how the SMA model presented in this paper solves them.
Recent studies have shown that online users tend to select information adhering to their system of beliefs, ignore information that does not, and join groups - i.e., echo chambers - around a shared narrative. Although a quantitative methodology for their identification is still missing, the phenomenon of echo chambers is widely debated at both the scientific and political levels. To shed light on this issue, we introduce an operational definition of echo chambers and perform a massive comparative analysis on more than 1B pieces of content produced by 1M users on four social media platforms: Facebook, Twitter, Reddit, and Gab. We infer the leaning of users about controversial topics - ranging from vaccines to abortion - and reconstruct their interaction networks by analyzing different features, such as the domains of shared links, followed pages, follower relationships, and commented posts. Our method quantifies the existence of echo chambers along two main dimensions: homophily in the interaction networks and bias in the information diffusion toward like-minded peers. We find peculiar differences across social media. Indeed, while Facebook and Twitter present clear-cut echo chambers in all the observed datasets, Reddit and Gab do not. Finally, we test the role of the social media platform on news consumption by comparing Reddit and Facebook. Again, we find support for the hypothesis that platforms implementing news feed algorithms, like Facebook, may elicit the emergence of echo chambers.
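The homophily dimension mentioned above can be illustrated with a minimal sketch. It assumes each user has a leaning score in [-1, 1] and that homophily is summarized as the Pearson correlation between a user's leaning and the average leaning of their neighbors. The function names, the undirected edge list, and the choice of correlation as the summary statistic are assumptions made for illustration, not the authors' published procedure.

```python
from statistics import mean


def neighborhood_leaning(leaning, edges):
    """Average leaning of each user's neighbors in an undirected network."""
    neighbors = {user: set() for user in leaning}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Users without neighbors are skipped: their average is undefined.
    return {u: mean(leaning[v] for v in nbrs)
            for u, nbrs in neighbors.items() if nbrs}


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def homophily_score(leaning, edges):
    """Correlation between individual leaning and neighborhood leaning."""
    nbr = neighborhood_leaning(leaning, edges)
    users = sorted(nbr)
    return pearson([leaning[u] for u in users], [nbr[u] for u in users])


# Hypothetical toy data: two like-minded pairs versus two cross-cutting pairs.
toy_leaning = {"a": -1.0, "b": -0.8, "c": 0.9, "d": 1.0}
segregated = homophily_score(toy_leaning, [("a", "b"), ("c", "d")])
mixed = homophily_score(toy_leaning, [("a", "c"), ("b", "d")])
```

Under these assumptions, a score near 1 indicates that users mostly interact with peers of similar leaning, while a score near or below 0 indicates cross-cutting interaction.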
The outbreak of COVID-19 has transformed societies across the world as governments tackle the health, economic, and social costs of the pandemic. It has also raised concerns about the spread of hateful language and prejudice online, especially hostility directed against East Asia. In this paper we report on the creation of a classifier that detects and categorizes social media posts from Twitter into four classes: Hostility against East Asia, Criticism of East Asia, Meta-discussions of East Asian prejudice, and a neutral class. The classifier achieves an F1 score of 0.83 across all four classes. We provide our final model (coded in Python), as well as a new 20,000-tweet training dataset used to make the classifier, two analyses of hashtags associated with East Asian prejudice, and the annotation codebook. The classifier can be implemented by other researchers, assisting with both online content moderation processes and further research into the dynamics, prevalence, and impact of East Asian prejudice online during this global pandemic.
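For readers unfamiliar with how a single F1 score is reported over several classes, the following is a minimal sketch of a macro-averaged F1. The abstract does not state which averaging scheme was used, so macro averaging is an assumption here, and the short label strings are hypothetical stand-ins for the four classes.

```python
def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # convention: no true positives means F1 of zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical label names standing in for the four classes in the abstract.
LABELS = ["hostility", "criticism", "meta", "neutral"]
perfect = macro_f1(LABELS, LABELS, LABELS)
partial = macro_f1(
    ["hostility", "hostility", "neutral", "neutral"],
    ["hostility", "neutral", "neutral", "neutral"],
    ["hostility", "neutral"],
)
```

Macro averaging weights every class equally, which matters when the neutral class is far larger than the hostile one; a micro or weighted average would give a different figure on the same predictions.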
While most mortality rates have decreased in the US, maternal mortality has increased and is among the highest of any OECD nation. Extensive public health research is ongoing to better understand the characteristics of communities with relatively high or low rates. In this work, we explore the role that social media language can play in providing insights into such community characteristics. Analyzing pregnancy-related tweets generated in US counties, we reveal a diverse set of latent topics including Morning Sickness, Celebrity Pregnancies, and Abortion Rights. We find that rates of mentioning these topics on Twitter predict maternal mortality rates with higher accuracy than standard socioeconomic and risk variables such as income, race, and access to health care, holding even after reducing the analysis to six topics chosen for their interpretability and connections to known risk factors. We then investigate psychological dimensions of community language, finding that the use of less trustful, more stressed, and more negative affective language is significantly associated with higher mortality rates, while trust and negative affect also explain a significant portion of racial disparities in maternal mortality. We discuss the potential for these insights to inform actionable health interventions at the community level.
We propose and develop a Lexicocalorimeter: an online, interactive instrument for measuring the caloric content of social media and other large-scale texts. We do so by constructing extensive yet improvable tables of food- and activity-related phrases, and respectively assigning them sourced estimates of caloric intake and expenditure. We show that for Twitter, our naive measures of caloric input, caloric output, and the ratio of these measures are all strong correlates of health and well-being measures for the contiguous United States. Our caloric balance measure in many cases outperforms both its constituent quantities, is tunable to specific health and well-being measures such as diabetes rates, has the capability of providing a real-time signal reflecting a population's health, and has the potential to be used alongside traditional survey data in the development of public policy and collective self-awareness. Because our Lexicocalorimeter is a linear superposition of principled phrase scores, we also show we can move beyond correlations to explore what people talk about in collective detail, and assist in the understanding and explanation of how population-scale conditions vary, a capacity unavailable to black-box methods.
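The idea of a linear superposition of phrase scores can be sketched as a frequency-weighted average over a phrase table. Everything concrete in the sketch is an assumption for illustration: the calorie values are made-up placeholders rather than sourced estimates, the table and function names are hypothetical, and the direction of the ratio (output divided by input) is assumed because the abstract only mentions "the ratio of these measures".

```python
from collections import Counter

# Placeholder phrase tables; the numbers are illustrative, not sourced estimates.
FOOD_CALORIES = {"pizza": 285.0, "apple": 95.0}
ACTIVITY_CALORIES = {"running": 300.0, "watching tv": 70.0}


def phrase_average(counts, table):
    """Frequency-weighted average score plus each phrase's additive share."""
    matched = {p: c for p, c in counts.items() if p in table}
    total = sum(matched.values())
    if total == 0:
        return 0.0, {}
    # Linear superposition: the average is the sum of per-phrase contributions.
    contributions = {p: c * table[p] / total for p, c in matched.items()}
    return sum(contributions.values()), contributions


def caloric_balance(counts):
    """Ratio of average caloric output to average caloric input (assumed direction)."""
    c_in, _ = phrase_average(counts, FOOD_CALORIES)
    c_out, _ = phrase_average(counts, ACTIVITY_CALORIES)
    return c_out / c_in if c_in else float("nan")


# Hypothetical phrase counts for one region's tweets.
region_counts = Counter({"pizza": 3, "apple": 1, "running": 2, "watching tv": 2})
caloric_in, food_shares = phrase_average(region_counts, FOOD_CALORIES)
caloric_out, activity_shares = phrase_average(region_counts, ACTIVITY_CALORIES)
balance = caloric_balance(region_counts)
```

Because the score is a plain sum, the per-phrase shares show which phrases drive the difference between two regions, which is the interpretability property the abstract contrasts with black-box methods.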
Multimedia content on social media platforms provides significant information during disaster events. The types of information shared include reports of injured or deceased people, infrastructure damage, and missing or found people, among others. Although many studies have shown the usefulness of both text and image content for disaster response purposes, past research has mostly focused on analyzing only the text modality. In this paper, we propose to use both text and image modalities of social media data to learn a joint representation using state-of-the-art deep learning techniques. Specifically, we utilize convolutional neural networks to define a multimodal deep learning architecture with a modality-agnostic shared representation. Extensive experiments on real-world disaster datasets show that the proposed multimodal architecture yields better performance than models trained using a single modality (either text or image).