No Arabic abstract
Nearly all previous work on geo-locating latent states and activities from social media confounds general discussions about activities, self-reports of users participating in those activities at times in the past or future, and self-reports made at the immediate time and place the activity occurs. Activities, such as alcohol consumption, may occur at different places and types of places, and it is important not only to detect the local regions where these activities occur, but also to analyze the degree of participation in them by local residents. In this paper, we develop new machine learning based methods for fine-grained localization of activities and home locations from Twitter data. We apply these methods to discover and compare alcohol consumption patterns in a large urban area, New York City, and a more suburban and rural area, Monroe County. We find positive correlations between the rate of alcohol consumption reported among a communitys Twitter users and the density of alcohol outlets, demonstrating that the degree of correlation varies significantly between urban and suburban areas. While our experiments are focused on alcohol use, our methods for locating homes and distinguishing temporally-specific self-reports are applicable to a broad range of behaviors and latent states.
We study the extent to which we can infer users geographical locations from social media. Location inference from social media can benefit many applications, such as disaster management, targeted advertising, and news content tailoring. The challenges, however, lie in the limited amount of labeled data and the large scale of social networks. In this paper, we formalize the problem of inferring location from social media into a semi-supervised factor graph model (SSFGM). The model provides a probabilistic framework in which various sources of information (e.g., content and social network) can be combined together. We design a two-layer neural network to learn feature representations, and incorporate the learned latent features into SSFGM. To deal with the large-scale problem, we propose a Two-Chain Sampling (TCS) algorithm to learn SSFGM. The algorithm achieves a good trade-off between accuracy and efficiency. Experiments on Twitter and Weibo show that the proposed TCS algorithm for SSFGM can substantially improve the inference accuracy over several state-of-the-art methods. More importantly, TCS achieves over 100x speedup comparing with traditional propagation-based methods (e.g., loopy belief propagation).
Over the past two decades, school shootings within the United States have repeatedly devastated communities and shaken public opinion. Many of these attacks appear to be `lone wolf ones driven by specific individual motivations, and the identification of precursor signals and hence actionable policy measures would thus seem highly unlikely. Here, we take a system-wide view and investigate the timing of school attacks and the dynamical feedback with social media. We identify a trend divergence in which college attacks have continued to accelerate over the last 25 years while those carried out on K-12 schools have slowed down. We establish the copycat effect in school shootings and uncover a statistical association between social media chatter and the probability of an attack in the following days. While hinting at causality, this relationship may also help mitigate the frequency and intensity of future attacks.
Urban flow monitoring systems play important roles in smart city efforts around the world. However, the ubiquitous deployment of monitoring devices, such as CCTVs, induces a long-lasting and enormous cost for maintenance and operation. This suggests the need for a technology that can reduce the number of deployed devices, while preventing the degeneration of data accuracy and granularity. In this paper, we aim to infer the real-time and fine-grained crowd flows throughout a city based on coarse-grained observations. This task is challenging due to two reasons: the spatial correlations between coarse- and fine-grained urban flows, and the complexities of external impacts. To tackle these issues, we develop a method entitled UrbanFM based on deep neural networks. Our model consists of two major parts: 1) an inference network to generate fine-grained flow distributions from coarse-grained inputs by using a feature extraction module and a novel distributional upsampling module; 2) a general fusion subnet to further boost the performance by considering the influences of different external factors. Extensive experiments on two real-world datasets, namely TaxiBJ and HappyValley, validate the effectiveness and efficiency of our method compared to seven baselines, demonstrating the state-of-the-art performance of our approach on the fine-grained urban flow inference problem.
Image completion has achieved significant progress due to advances in generative adversarial networks (GANs). Albeit natural-looking, the synthesized contents still lack details, especially for scenes with complex structures or images with large holes. This is because there exists a gap between low-level reconstruction loss and high-level adversarial loss. To address this issue, we introduce a perceptual network to provide mid-level guidance, which measures the semantical similarity between the synthesized and original contents in a similarity-enhanced space. We conduct a detailed analysis on the effects of different losses and different levels of perceptual features in image completion, showing that there exist complementarity between adversarial training and perceptual features. By combining them together, our model can achieve nearly seamless fusion results in an end-to-end manner. Moreover, we design an effective lightweight generator architecture, which can achieve effective image inpainting with far less parameters. Evaluated on CelebA Face and Paris StreetView dataset, our proposed method significantly outperforms existing methods.
Companies and financial investors are paying increasing attention to social consciousness in developing their corporate strategies and making investment decisions to support a sustainable economy for the future. Public discussion on incidents and events -- controversies -- of companies can provide valuable insights on how well the company operates with regards to social consciousness and indicate the companys overall operational capability. However, there are challenges in evaluating the degree of a companys social consciousness and environmental sustainability due to the lack of systematic data. We introduce a system that utilizes Twitter data to detect and monitor controversial events and show their impact on market volatility. In our study, controversial events are identified from clustered tweets that share the same 5W terms and sentiment polarities of these clusters. Credible news links inside the event tweets are used to validate the truth of the event. A case study on the Starbucks Philadelphia arrests shows that this method can provide the desired functionality.