In October 2017, numerous women accused producer Harvey Weinstein of sexual harassment. Their stories encouraged other women to voice allegations of sexual harassment against many high-profile men, including politicians, actors, and producers. These events are broadly referred to as the #MeToo movement, named for the use of the hashtag #metoo on social media platforms like Twitter and Facebook. The movement has widely been described as empowering because it has amplified the voices of previously unheard women over those of traditionally powerful men. In this work, we investigate the dynamics of sentiment, power, and agency in online media coverage of these events. Using a corpus of online media articles about the #MeToo movement, we present a contextual affective analysis---an entity-centric approach that uses contextualized lexicons to examine how people are portrayed in media articles. We show that while these articles are sympathetic towards women who have experienced sexual harassment, they consistently present men as most powerful, even after sexual assault allegations. While we focus on media coverage of the #MeToo movement, our method for contextual affective analysis readily generalizes to other domains.
Specific lexical choices in narrative text both reflect the writer's attitudes towards people in the narrative and influence the audience's reactions. Prior work has examined descriptions of people in English using contextual affective analysis, a natural language processing (NLP) technique that seeks to analyze how people are portrayed along dimensions of power, agency, and sentiment. Our work presents an extension of this methodology to multilingual settings, which is enabled by a new corpus that we collect and a new multilingual model. We additionally show how word connotations differ across languages and cultures, highlighting the difficulty of generalizing existing English datasets and methods. We then demonstrate the usefulness of our method by analyzing Wikipedia biography pages of members of the LGBT community across three languages: English, Russian, and Spanish. Our results show systematic differences in how the LGBT community is portrayed across languages, surfacing cultural differences in narratives and signs of social biases. Practically, this model can be used to identify Wikipedia articles for further manual analysis -- articles that might contain content gaps or an imbalanced representation of particular social groups.
While contextualized word representations have improved state-of-the-art benchmarks in many NLP tasks, their potential usefulness for social-oriented tasks remains largely unexplored. We show how contextualized word embeddings can be used to capture affect dimensions in portrayals of people. We evaluate our methodology quantitatively, on held-out affect lexicons, and qualitatively, through case examples. We find that contextualized word representations do encode meaningful affect information, but they are heavily biased towards their training data, which limits their usefulness to in-domain analyses. We ultimately use our method to examine differences in portrayals of men and women.
The coronavirus outbreak is one of the most challenging pandemics the human population has faced. Isolating infected persons and maintaining social distancing are the only preventive measures against the COVID-19 epidemic. Estimating the actual number of infected persons from limited data is an indeterminate problem faced by data scientists. The existing literature offers a large number of techniques, including the reproduction number and the case fatality rate, for predicting the duration of an epidemic and the size of the infectious population. This paper presents a case study of different techniques for analysing, modeling, and representing data associated with an epidemic such as COVID-19. We further propose an algorithm for estimating infection transmission states in a particular area. This work also presents an algorithm for estimating the end time of an epidemic from the Susceptible-Infectious-Recovered (SIR) model. Finally, this paper presents an empirical data analysis to study the impact of transmission probability, rate of contact, and the infectious and susceptible populations on epidemic spread.
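To make the SIR-based end-time idea concrete, the following is a minimal sketch rather than the paper's algorithm. It assumes a standard discrete-time (forward Euler) SIR model, takes the transmission rate to be the product of contact rate and transmission probability, and declares the epidemic over once the infectious count falls below a threshold. The function name `simulate_sir_end_time`, its parameters, and the threshold are all hypothetical choices made for illustration.

```python
def simulate_sir_end_time(population, initial_infected, contact_rate,
                          transmission_prob, recovery_rate,
                          dt=0.1, max_days=1000.0, threshold=1.0):
    """Forward-Euler SIR simulation (illustrative assumption, not the paper's method).

    Returns (end_time, peak_infectious, (s, i, r)), where end_time is the first
    simulated time at which the infectious count is below `threshold`, or None
    if that does not happen within `max_days`.
    """
    # Assumed form of the transmission rate: contacts per day times the
    # probability that a contact transmits the infection.
    beta = contact_rate * transmission_prob

    s = float(population - initial_infected)  # susceptible
    i = float(initial_infected)               # infectious
    r = 0.0                                   # recovered
    t = 0.0
    peak = i

    while t < max_days:
        if i < threshold:
            return t, peak, (s, i, r)
        new_infections = beta * s * i / population * dt
        new_recoveries = recovery_rate * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        t += dt
        peak = max(peak, i)

    return None, peak, (s, i, r)


if __name__ == "__main__":
    end_time, peak, _ = simulate_sir_end_time(
        population=10_000, initial_infected=10,
        contact_rate=10, transmission_prob=0.05, recovery_rate=0.1)
    print(f"estimated end time (days): {end_time}, peak infectious: {peak:.0f}")
```

Varying `contact_rate` or `transmission_prob` in this sketch gives a rough feel for how those quantities change the peak and the estimated end time; the values used above are made up for illustration and are not fitted to any data.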
A growing number of empirical studies suggest that negative advertising is effective in campaigning, but the underlying mechanisms are rarely discussed. Following the Cambridge Analytica scandal and Russian intervention in Brexit and the 2016 presidential election, people have become aware of political ads on social media and have pressured Congress to restrict them. Following the related legislation, social media companies began disclosing their political ad archives for transparency during the summer of 2018, when the midterm election campaign was just beginning. This research collects data on political ads related to the U.S. midterm elections from August onward to study the overall pattern of political ads on social media, and uses a set of machine learning methods to conduct sentiment analysis on these ads and classify the negative ones. A novel approach is applied that uses AI image recognition to study the image data. Through data visualization, this research shows that negative advertising is still in the minority and that Republican advertisers and third-party organizations are more likely to engage in negative advertising than their counterparts. Based on ordinal regressions, this study finds that anger-evoked information seeking, rather than the negative bias theory, is one of the main mechanisms that make negative ads more engaging and effective. Overall, this study provides a unique understanding of political advertising on social media by applying innovative data science methods. Further studies can extend the findings, methods, and datasets in this study, and several suggestions are given for future research.
Many applications like pointer analysis and incremental compilation require maintaining a topological ordering of the nodes of a directed acyclic graph (DAG) under dynamic updates. All known algorithms for this problem are either only analyzed for worst-case insertion sequences or only evaluated experimentally on random DAGs. We present the first average-case analysis of online topological ordering algorithms. We prove an expected runtime of O(n^2 polylog(n)) under insertion of the edges of a complete DAG in a random order for the algorithms of Alpern et al. (SODA, 1990), Katriel and Bodlaender (TALG, 2006), and Pearce and Kelly (JEA, 2006). This is much less than the best known worst-case bound O(n^{2.75}) for this problem.
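As an illustration of what maintaining a topological ordering under edge insertions involves, here is a minimal sketch loosely modeled on the affected-region idea used by Pearce and Kelly. It is not the implementation analyzed in the paper: the class name `OnlineTopologicalOrder` and its methods are hypothetical, and the sketch makes no attempt to match the stated runtime bounds. When a new edge (x, y) violates the current order, only the nodes between y and x that are reachable forward from y or backward from x are reordered.

```python
class OnlineTopologicalOrder:
    """Keeps a valid topological order of a DAG while edges are inserted."""

    def __init__(self, nodes):
        nodes = list(nodes)
        self.order = {v: idx for idx, v in enumerate(nodes)}  # node -> position
        self.succ = {v: set() for v in nodes}
        self.pred = {v: set() for v in nodes}

    def add_edge(self, x, y):
        """Insert edge x -> y; raise ValueError if it would create a cycle."""
        if x == y:
            raise ValueError("self-loop would create a cycle")
        lower, upper = self.order[y], self.order[x]
        if lower < upper:
            # The order is violated: y currently precedes x.
            forward = self._reach(y, self.succ,
                                  lambda w: self.order[w] <= upper)
            if x in forward:
                raise ValueError("edge would create a cycle")
            backward = self._reach(x, self.pred,
                                   lambda w: self.order[w] > lower)
            self._reorder(backward, forward)
        self.succ[x].add(y)
        self.pred[y].add(x)

    def _reach(self, start, adjacency, in_range):
        """Iterative DFS restricted to nodes inside the affected region."""
        seen = {start}
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adjacency[v]:
                if w not in seen and in_range(w):
                    seen.add(w)
                    stack.append(w)
        return seen

    def _reorder(self, backward, forward):
        """Place all backward nodes before all forward nodes, reusing their slots."""
        position = self.order.__getitem__
        moved = sorted(backward, key=position) + sorted(forward, key=position)
        slots = sorted(self.order[v] for v in moved)
        for v, slot in zip(moved, slots):
            self.order[v] = slot

    def as_list(self):
        return sorted(self.order, key=self.order.__getitem__)


if __name__ == "__main__":
    topo = OnlineTopologicalOrder(range(4))
    for edge in [(3, 2), (2, 1), (1, 0)]:
        topo.add_edge(*edge)
    print(topo.as_list())
```

Nodes outside the affected region keep their positions, which is why insertions that respect the current order cost nothing beyond updating the adjacency sets.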