We address the problem of maximizing user engagement with content (in the form of likes, replies, retweets, and retweets with comments) on the Twitter platform. We formulate the engagement forecasting task as a multi-label classification problem that captures choice behavior on an unsupervised clustering of tweet topics. We propose a neural network architecture that incorporates user engagement history and predicts choice conditional on this context. We study the impact of recommending tweets on engagement outcomes by solving an appropriately defined tweet optimization problem based on the proposed model, using a large dataset obtained from Twitter.
To reach a broader audience and optimize traffic toward news articles, media outlets commonly run social media accounts and share their content with a short text summary. Despite the importance of writing a compelling message when sharing articles, the research community lacks a sufficient understanding of which editing strategies effectively promote audience engagement. In this study, we aim to fill this gap by analyzing media outlets' current practices using a data-driven approach. We first build a parallel corpus of original news articles and the corresponding tweets shared by eight media outlets. Then, we explore how those outlets edited their tweets relative to the original headlines, and the effects of such changes. To estimate the effects on audience engagement of editing news headlines for social media sharing, we present a systematic analysis that combines a causal inference technique with deep learning; using propensity score matching, it allows for estimating the potential (dis-)advantages of an editing style compared to counterfactual cases where a similar news article is shared with a different style. Based on the analyses of various editing styles, we report effects that are common across the outlets and effects that differ between them. Media outlets can apply our easy-to-use tool themselves to understand the effects of various editing styles.
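The propensity score matching step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated by some model (the abstract does not specify how), and it uses one-to-one nearest-neighbor matching with replacement, which is one common variant and not necessarily the one used in the study. The function name `att_nearest_neighbor`, the `caliper` parameter, and the toy data are all hypothetical.

```python
def att_nearest_neighbor(units, caliper=None):
    """Estimate the average treatment effect on the treated (ATT).

    units: list of (propensity, treated, outcome) tuples, where the
    propensity score is assumed to be precomputed (hypothetical input).
    Each treated unit is matched, with replacement, to the control unit
    whose propensity score is closest.
    """
    treated = [(p, y) for p, t, y in units if t]
    controls = [(p, y) for p, t, y in units if not t]
    if not controls:
        raise ValueError("no control units to match against")

    diffs = []
    for p_t, y_t in treated:
        # Nearest control by absolute difference in propensity score.
        p_c, y_c = min(controls, key=lambda c: abs(c[0] - p_t))
        # Optionally discard poor matches (caliper is an assumption).
        if caliper is not None and abs(p_c - p_t) > caliper:
            continue
        diffs.append(y_t - y_c)

    if not diffs:
        raise ValueError("no treated unit found a match within the caliper")
    return sum(diffs) / len(diffs)


# Toy data: (propensity, uses_editing_style, engagement). Purely illustrative.
toy_units = [
    (0.80, True, 10),
    (0.60, True, 7),
    (0.75, False, 8),
    (0.55, False, 6),
    (0.20, False, 1),
]
att = att_nearest_neighbor(toy_units)
print(att)
```

In this toy example each tweet that uses the editing style is compared with the most similar tweet that does not, so the estimate approximates the counterfactual comparison the abstract describes.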
This paper presents a user modeling pipeline to analyze discussions and opinions shared on social media regarding polarized political events (e.g., public polls). The pipeline follows a four-step methodology. First, social media posts and user metadata are crawled. Second, a filtering mechanism is applied to remove spammers and bot users. Third, demographic information is extracted for the valid users, namely gender, age, ethnicity, and location. Finally, the political polarity of the users with respect to the analyzed event is predicted. In the scope of this work, our proposed pipeline is applied to two referendum scenarios (independence of Catalonia in Spain and autonomy of Lombardy in Italy) in order to assess how well the approach collects correct insights on the demographics of social media users and predicts the poll results based on the opinions shared by the users. Experiments show that the method was effective in predicting the political trends for the Catalonia case, but not for the Lombardy case. Among the various reasons for this, we noticed that in general Twitter was more representative of the users opposing the referendum than of those in favor.
Contagion dynamics can emerge in social networks when repeated activation is allowed. An interesting example of this phenomenon is retweet cascades, in which users are allowed to re-share content posted by other people with public accounts. To model this type of behaviour, we use a Hawkes self-exciting process. To do so properly, though, one needs to calibrate the model under consideration. The main goal of this paper is to construct a method-of-moments estimator for this model. The key step is identifying the generator of the Hawkes process. We also perform a numerical analysis on real data.
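To make the self-exciting behaviour described above concrete, the following sketch simulates a Hawkes process and checks its first moment. It assumes an exponential kernel, so the intensity is lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i)); the abstract does not state the kernel, so this choice and the parameter values are assumptions. The simulation uses Ogata's thinning algorithm, and the moment check relies on the standard stationary mean rate mu / (1 - alpha / beta) for alpha < beta. This is only a first-moment illustration, not the generator-based estimator the paper constructs.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon] by Ogata's thinning algorithm.

    Intensity (assumed exponential kernel):
        lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i))
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # current value of the self-exciting sum at time t
    while True:
        # Between events the intensity only decays, so its current value
        # is a valid upper bound for the thinning step.
        lam_bar = mu + excitation
        wait = rng.expovariate(lam_bar)
        if t + wait > horizon:
            break
        t += wait
        excitation *= math.exp(-beta * wait)  # decay over the waiting time
        # Accept the candidate with probability lambda(t) / lam_bar.
        if rng.random() * lam_bar <= mu + excitation:
            events.append(t)
            excitation += alpha  # each accepted event raises the intensity
    return events


# Illustrative parameters (assumptions, not taken from the paper).
mu, alpha, beta, horizon = 1.0, 0.5, 1.0, 2000.0
events = simulate_hawkes(mu, alpha, beta, horizon)

empirical_rate = len(events) / horizon
theoretical_rate = mu / (1.0 - alpha / beta)  # stationary mean intensity

# First-moment matching: if the branching ratio alpha / beta were known,
# the baseline rate could be recovered from the observed event rate.
mu_hat = empirical_rate * (1.0 - alpha / beta)
print(empirical_rate, theoretical_rate, mu_hat)
```

A single moment identifies only one parameter; calibrating mu, alpha, and beta jointly requires additional moment conditions, which is the problem the abstract addresses.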
Human decision making underlies the data-generating process in many application areas, and models that explain and predict choices made by individuals are in high demand. Discrete choice models are widely studied in economics and computational social sciences. As digital social networking facilitates information flow and the spread of influence between individuals, new advances in modeling are needed to incorporate social information into these models, in addition to the characteristic features affecting individual choices. In this paper, we propose two novel models with scalable training algorithms: local logistics graph regularization (LLGR) and latent class graph regularization (LCGR) models. We add social regularization to represent similarity between friends, and we introduce latent classes to account for possible preference discrepancies between different social groups. The LLGR model is trained using the alternating direction method of multipliers (ADMM), and the LCGR model is trained using a specialized Monte Carlo expectation maximization (MCEM) algorithm. Scalability to large graphs is achieved by parallelizing computation in both the expectation and the maximization steps. The LCGR model is the first latent class classification model that incorporates social relationships among individuals represented by a given graph. To evaluate our two models, we consider three classes of data to illustrate a typical large-scale use case in internet and social media applications. We experiment on synthetic datasets to explain empirically when the proposed models are better than vanilla classification models that do not exploit graph structure. We also experiment on both small-scale and large-scale real-world datasets to demonstrate on which types of datasets our models can be expected to outperform state-of-the-art models.
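The idea of social regularization mentioned above can be sketched as an objective function. The abstract does not give the exact formulation, so the following assumes one plausible form: each user has their own logistic regression weights, and a squared-difference penalty over the edges of the social graph pulls the weights of friends toward each other. The function name `social_regularized_loss`, the penalty form, and the toy data are assumptions for illustration, and no training algorithm (ADMM or otherwise) is shown.

```python
import math


def social_regularized_loss(weights, features, labels, edges, lam):
    """Logistic loss per user plus a penalty on weight differences of friends.

    weights:  dict user -> list of floats (per-user parameters)
    features: dict user -> list of floats
    labels:   dict user -> 0 or 1 (the observed binary choice)
    edges:    list of (user, user) pairs in the social graph
    lam:      strength of the social regularization (assumed hyperparameter)
    """
    loss = 0.0
    for user, x in features.items():
        z = sum(w * xi for w, xi in zip(weights[user], x))
        # Numerically stable form of log(1 + exp(z)) - y * z.
        loss += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - labels[user] * z

    # Friends are encouraged to have similar parameters.
    penalty = sum(
        sum((a - b) ** 2 for a, b in zip(weights[u], weights[v]))
        for u, v in edges
    )
    return loss + lam * penalty


# Toy example with two users who are friends (illustrative values only).
features = {"a": [1.0, 0.5], "b": [0.2, 1.0]}
labels = {"a": 1, "b": 0}
edges = [("a", "b")]

zero = {"a": [0.0, 0.0], "b": [0.0, 0.0]}
split = {"a": [1.0, 0.0], "b": [-1.0, 0.0]}

loss_zero = social_regularized_loss(zero, features, labels, edges, lam=1.0)
loss_split_free = social_regularized_loss(split, features, labels, edges, lam=0.0)
loss_split_pen = social_regularized_loss(split, features, labels, edges, lam=0.5)
print(loss_zero, loss_split_free, loss_split_pen)
```

With `lam` set to zero the users are fit independently; increasing `lam` trades individual fit for agreement between connected users.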
Working adults spend nearly one third of their daily time at their jobs. In this paper, we study job-related social media discourse from a community of users. We use both crowdsourcing and local expertise to train a classifier to detect job-related messages on Twitter. Additionally, we analyze the linguistic differences between individual users and commercial accounts in a corpus of job-related tweets. The volumes of job-related tweets from individual users indicate that people use Twitter with distinct monthly, daily, and hourly patterns. We further show that the moods associated with jobs, both positive and negative, have unique diurnal rhythms.