A Probabilistic Framework for Location Inference from Social Media


Abstract in English

We study the extent to which users' geographical locations can be inferred from social media. Location inference from social media can benefit many applications, such as disaster management, targeted advertising, and news content tailoring. The challenges, however, lie in the limited amount of labeled data and the large scale of social networks. In this paper, we formalize the problem of inferring location from social media as a semi-supervised factor graph model (SSFGM). The model provides a probabilistic framework in which various sources of information (e.g., content and social network) can be combined. We design a two-layer neural network to learn feature representations, and incorporate the learned latent features into SSFGM. To handle large-scale networks, we propose a Two-Chain Sampling (TCS) algorithm to learn SSFGM. The algorithm achieves a good trade-off between accuracy and efficiency. Experiments on Twitter and Weibo show that the proposed TCS algorithm for SSFGM substantially improves inference accuracy over several state-of-the-art methods. More importantly, TCS achieves over 100x speedup compared with traditional propagation-based methods (e.g., loopy belief propagation).
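To make the idea of combining content and network information in one probabilistic model concrete, the following is a minimal sketch of semi-supervised label inference on a toy pairwise factor graph. It is not the Two-Chain Sampling algorithm or the SSFGM of the paper: it uses ordinary Gibbs sampling, and the function name `gibbs_infer`, the parameters `w_edge`, `sweeps`, and `burn_in`, the user ids, and all scores are assumptions made up for illustration.

```python
import math
import random
from collections import Counter


def gibbs_infer(nodes, edges, labels, content_score, known,
                w_edge=2.0, sweeps=200, burn_in=50, seed=0):
    """Toy Gibbs sampler over a pairwise factor graph (illustrative only).

    content_score[v][y] is a hypothetical unary log-score for user v at
    location y; each edge adds w_edge when both endpoints share a location.
    Users in `known` are labeled and stay clamped (the semi-supervised part).
    """
    rng = random.Random(seed)
    neighbors = {v: [] for v in nodes}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Labeled users keep their label; unlabeled users start at random.
    state = {v: known.get(v, rng.choice(labels)) for v in nodes}
    counts = {v: Counter() for v in nodes if v not in known}

    for sweep in range(sweeps):
        for v in nodes:
            if v in known:
                continue
            # Unnormalized log-probability of each candidate location.
            logits = []
            for y in labels:
                agree = sum(1 for u in neighbors[v] if state[u] == y)
                unary = content_score.get(v, {}).get(y, 0.0)
                logits.append(unary + w_edge * agree)
            top = max(logits)
            weights = [math.exp(l - top) for l in logits]
            state[v] = rng.choices(labels, weights=weights)[0]
            if sweep >= burn_in:
                counts[v][state[v]] += 1

    # Predict the most frequently sampled location for each unlabeled user.
    return {v: c.most_common(1)[0][0] for v, c in counts.items()}


# Hypothetical data: u1 and u4 are labeled, u3 has no content signal at all.
nodes = ["u1", "u2", "u3", "u4", "u5"]
edges = [("u1", "u2"), ("u2", "u3"), ("u1", "u3"), ("u4", "u5"), ("u3", "u5")]
labels = ["Beijing", "Shanghai"]
content_score = {"u2": {"Beijing": 4.0}, "u5": {"Shanghai": 4.0}}
known = {"u1": "Beijing", "u4": "Shanghai"}

result = gibbs_infer(nodes, edges, labels, content_score, known)
print(result)
```

In this toy setup, user u3 has no content features, so any prediction for it comes only from its neighbors, which is the kind of network signal the abstract describes combining with content.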
