Shaped by human movement, place connectivity is quantified by the strength of spatial interactions among locations. For decades, spatial scientists have researched place connectivity, its applications, and its metrics. The growing popularity of social media provides a new data stream where spatial social interaction measures are largely devoid of privacy issues, easily assessable, and harmonized. In this study, we introduce a global multi-scale place connectivity index (PCI) based on spatial interactions among places revealed by geotagged tweets as a spatiotemporally continuous and easy-to-implement measurement. The multi-scale PCI, demonstrated at the US county level, exhibits a strong positive association with SafeGraph population movement records (10 percent penetration in the US population) and Facebook's social connectedness index (SCI), a popular connectivity index based on social networks. We find that PCI has a strong boundary effect and that it generally follows distance decay, although this force is weaker in more urbanized counties with a denser population. Our investigation further suggests that PCI has great potential in addressing real-world problems that require place connectivity knowledge, exemplified by two applications: 1) modeling the spatial spread of COVID-19 during the early stage of the pandemic and 2) modeling hurricane evacuation destination choice. The methodological and contextual knowledge of PCI, together with the launched visualization platform and open-sourced PCI datasets at various geographic levels, is expected to support research fields requiring knowledge of human spatial interactions.
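As an illustration of how a connectivity index can be derived from geotagged posts, the following minimal sketch counts the users shared by each pair of places and normalizes by the geometric mean of the two places' user counts. The abstract does not state the PCI formula, so this normalization, the function name `place_connectivity`, and the toy `visits` data are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def place_connectivity(visits):
    """Return a pairwise connectivity score for every pair of places.

    visits: iterable of (user_id, place_id) pairs, e.g. one per geotagged post.
    The score is an ASSUMED normalization (shared users divided by the
    geometric mean of the two places' user counts), not the paper's formula.
    """
    users_by_place = defaultdict(set)
    for user, place in visits:
        users_by_place[place].add(user)

    index = {}
    for a, b in combinations(sorted(users_by_place), 2):
        shared = len(users_by_place[a] & users_by_place[b])
        scale = sqrt(len(users_by_place[a]) * len(users_by_place[b]))
        index[(a, b)] = shared / scale
    return index


# Toy data: users u1 and u2 post in both A and B; u3 posts in A and C.
visits = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"),
    ("u3", "A"), ("u3", "C"),
    ("u4", "C"),
]
pci = place_connectivity(visits)
```

In this toy example the pair (A, B) scores higher than (A, C) because it shares two users rather than one, and (B, C) scores zero because no user posted in both places.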
A key challenge in mining social media data streams is to identify events that are actively discussed by a group of people in a specific local or global area. Detecting such events is useful for early warning of accidents, protests, elections, or breaking news. However, neither the list of events nor the resolution of event time and space is fixed or known beforehand. In this work, we propose an online spatio-temporal event detection system using social media that is able to detect events at different time and space resolutions. First, to address the challenge of the unknown spatial resolution of events, a quad-tree method is used to split the geographical space into multiscale regions based on the density of social media data. Then, an unsupervised statistical approach is applied that uses a Poisson distribution and a smoothing method to highlight regions with an unexpected density of social posts. Further, event duration is precisely estimated by merging events happening in the same region at consecutive time intervals. A post-processing stage is introduced to filter out events that are spam, fake, or wrong. Finally, we incorporate simple semantics by using social media entities to assess the integrity and accuracy of detected events. The proposed method is evaluated using different social media datasets, Twitter and Flickr, for different cities: Melbourne, London, Paris, and New York. To verify the effectiveness of the proposed method, we compare our results with two baseline algorithms, one based on a fixed split of the geographical space and one based on a clustering method. For performance evaluation, we manually compute recall and precision. We also propose a new quality measure named strength index, which automatically measures how accurate the reported event is.
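The density-driven quad-tree split and the Poisson test described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `max_points` and `max_depth` thresholds, and the toy coordinates are assumptions, and the smoothing, merging, and filtering stages are omitted.

```python
import math


def quadtree_cells(points, bounds, max_points=4, max_depth=6, depth=0):
    """Split bounds=(xmin, ymin, xmax, ymax) until each cell holds few points.

    Points are assumed to lie inside the half-open box [xmin, xmax) x [ymin, ymax).
    Returns a list of (cell_bounds, points_in_cell) leaves.
    """
    xmin, ymin, xmax, ymax = bounds
    if len(points) <= max_points or depth >= max_depth:
        return [(bounds, points)]
    xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
    quadrants = [
        (xmin, ymin, xmid, ymid), (xmid, ymin, xmax, ymid),
        (xmin, ymid, xmid, ymax), (xmid, ymid, xmax, ymax),
    ]
    cells = []
    for q in quadrants:
        inside = [p for p in points if q[0] <= p[0] < q[2] and q[1] <= p[1] < q[3]]
        cells.extend(quadtree_cells(inside, q, max_points, max_depth, depth + 1))
    return cells


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam): how surprising a count of k posts is."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term            # accumulate P(X = i)
        term *= lam / (i + 1)  # advance to P(X = i + 1)
    return max(0.0, 1.0 - cdf)


# Toy posts: a dense cluster near the origin plus two isolated posts.
points = [(0.10, 0.10), (0.12, 0.11), (0.11, 0.13), (0.13, 0.12),
          (0.14, 0.10), (0.60, 0.70), (0.80, 0.20)]
cells = quadtree_cells(points, (0.0, 0.0, 1.0, 1.0), max_points=2)

# Flag cells whose current count is unlikely under an assumed baseline rate.
baseline_rate = 0.5  # hypothetical expected posts per cell per interval
unusual = [b for b, pts in cells if poisson_tail(len(pts), baseline_rate) < 0.1]
```

Dense areas end up covered by small cells and sparse areas by large ones, so the same Poisson test can be applied at whatever spatial resolution the data supports.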
In this paper, we propose a new measure to estimate the similarity between brands via the posts of brands' followers on social network services (SNS). Our method was developed with the intention of exploring the brands that customers are likely to purchase jointly. Nowadays, brands use social media for targeted advertising because influencing users' preferences can greatly affect sales trends. We assume that data on SNS allows us to make quantitative comparisons between brands. Our proposed algorithm analyzes the daily photos and hashtags posted by each brand's followers. By clustering them and converting them to histograms, we can calculate the similarity between brands. We evaluated our proposed algorithm with purchase logs, credit card information, and answers to questionnaires. The experimental results show that the purchase data maintained by a mall or a credit card company can predict co-purchases very well, but not customers' willingness to buy products of new brands. On the other hand, our method can predict users' interest in brands with a correlation value over 0.53, which is quite high considering that such interest in brands is highly subjective and individual-dependent.
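The histogram comparison step can be sketched as below, assuming each follower post has already been assigned a cluster ID. The abstract does not name the similarity function, so cosine similarity is an assumption here, as are the function names and the toy cluster assignments.

```python
from collections import Counter
import math


def histogram(cluster_ids, n_clusters):
    """Normalized histogram of cluster IDs (0 .. n_clusters - 1) for one brand."""
    total = len(cluster_ids)
    if total == 0:
        return [0.0] * n_clusters
    counts = Counter(cluster_ids)
    return [counts.get(c, 0) / total for c in range(n_clusters)]


def cosine_similarity(h1, h2):
    """Cosine similarity between two histograms (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(h1, h2))
    norm = math.sqrt(sum(a * a for a in h1)) * math.sqrt(sum(b * b for b in h2))
    return dot / norm if norm else 0.0


# Toy cluster assignments for posts by followers of three hypothetical brands.
brand_a = histogram([0, 0, 1, 2], n_clusters=4)
brand_b = histogram([0, 1, 1, 2], n_clusters=4)
brand_c = histogram([3, 3, 3, 3], n_clusters=4)

sim_ab = cosine_similarity(brand_a, brand_b)
sim_ac = cosine_similarity(brand_a, brand_c)
```

Brands whose followers post content in the same clusters (here A and B) receive a high score, while brands with no overlapping clusters (A and C) receive zero.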
How people connect with one another is a fundamental question in the social sciences, and the resulting social networks can have a profound impact on our daily lives. Blau offered a powerful explanation: people connect with one another based on their positions in a social space. Yet a principled measure of social distance, allowing comparison within and between societies, remains elusive. We use the connectivity kernel of conditionally-independent edge models to develop a family of segregation statistics with desirable properties: they offer an intuitive and universal characteristic scale on social space (facilitating comparison across datasets and societies), are applicable to multivariate and mixed node attributes, and capture segregation at the level of individuals, pairs of individuals, and society as a whole. We show that the segregation statistics can induce a metric on Blau space (a space spanned by the attributes of the members of society) and provide maps of two societies. Under a Bayesian paradigm, we infer the parameters of the connectivity kernel from eleven ego-network datasets collected in four surveys in the United Kingdom and United States. The importance of different dimensions of Blau space is similar across time and location, suggesting a macroscopically stable social fabric. Physical separation and age differences have the most significant impact on segregation within friendship networks with implications for intergenerational mixing and isolation in later stages of life.
Sleep condition is closely related to an individual's health. Poor sleep conditions such as sleep disorders and sleep deprivation affect one's daily performance and may also cause many chronic diseases. Many efforts have been devoted to monitoring people's sleep conditions. However, traditional methodologies require sophisticated equipment and consume a significant amount of time. In this paper, we attempt to develop a novel way to predict an individual's sleep condition by scrutinizing facial cues as doctors would. Rather than measuring the sleep condition directly, we measure sleep-deprived fatigue, which indirectly reflects the sleep condition. Our method can predict a sleep-deprived fatigue rate based on a selfie provided by a subject. This rate is used to indicate the sleep condition. To gain deeper insights into human sleep conditions, we collected around 100,000 faces from selfies posted on Twitter and Instagram, and identified their age, gender, and race using automatic algorithms. Next, we investigated the sleep condition distributions with respect to age, gender, and race. Our study suggests that, among the age groups, the fatigue percentage of the 0-20 youth and adolescent group is the highest, implying that poor sleep condition is more prevalent in this age group. For gender, the fatigue percentage of females is higher than that of males, implying that more females than males are suffering from sleep issues. Among ethnic groups, the fatigue percentage among Caucasians is the highest, followed by Asians and African Americans.
The COVID-19 pandemic has affected people's lives around the world on an unprecedented scale. We investigate hoarding behaviors in response to the pandemic using large-scale social media data. First, we collect hoarding-related tweets posted shortly after the outbreak of the coronavirus. Next, we analyze the hoarding and anti-hoarding patterns of over 42,000 unique Twitter users in the United States from March 1 to April 30, 2020, and dissect the hoarding-related tweets by age, gender, and geographic location. We find that the percentage of females in both the hoarding and anti-hoarding groups is higher than that of the general Twitter user population. Furthermore, using topic modeling, we investigate the opinions expressed towards hoarding behavior by categorizing these topics according to demographic and geographic groups. We also calculate the anxiety scores for the hoarding and anti-hoarding related tweets using a lexical approach. By comparing their anxiety scores with the baseline Twitter anxiety score, we reveal further insights. The LIWC anxiety mean for the hoarding-related tweets is significantly higher than the baseline Twitter anxiety mean. Interestingly, beer has the highest calculated anxiety score among the hoarded items mentioned in the tweets.
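A lexical anxiety score of the kind mentioned above can be sketched as the percentage of a tweet's tokens that appear in an anxiety word list. LIWC's dictionary is proprietary, so the small `ANXIETY_WORDS` set below is a hypothetical stand-in, and the function names and example tweets are likewise assumptions for illustration only.

```python
import re

# Hypothetical stand-in lexicon; this is NOT the LIWC anxiety category.
ANXIETY_WORDS = {"afraid", "anxious", "fear", "nervous", "panic", "scared", "worried"}


def anxiety_score(text, lexicon=ANXIETY_WORDS):
    """Percentage of tokens in `text` that appear in the anxiety lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return 100.0 * hits / len(tokens)


def mean_anxiety(texts, lexicon=ANXIETY_WORDS):
    """Mean anxiety score over a collection of tweets."""
    scores = [anxiety_score(t, lexicon) for t in texts]
    return sum(scores) / len(scores) if scores else 0.0


# Toy comparison between a hoarding-related group and a baseline group.
hoarding_tweets = ["I am worried and scared about empty shelves",
                   "Panic buying everywhere"]
baseline_tweets = ["Bought some bread today", "Nice weather this morning"]

hoarding_mean = mean_anxiety(hoarding_tweets)
baseline_mean = mean_anxiety(baseline_tweets)
```

Comparing `hoarding_mean` against `baseline_mean` mirrors the group-versus-baseline comparison in the abstract; a real analysis would also need a significance test, which this sketch leaves out.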