
Where in the World are You? Geolocation and Language Identification in Twitter

 Added by Scott A. Hale
 Publication date 2013
Research language: English





The movements of ideas and content between locations and languages are unquestionably crucial concerns to researchers of the information age, and Twitter has emerged as a central, global platform on which hundreds of millions of people share knowledge and information. A variety of research has attempted to harvest locational and linguistic metadata from tweets in order to understand important questions related to the 300 million tweets that flow through the platform each day. However, much of this work is carried out with only a limited understanding of how best to work with the spatial and linguistic contexts in which the information was produced. Furthermore, standard, well-accepted practices have yet to emerge. As such, this paper studies the reliability of key methods used to determine the language and location of content on Twitter. It compares three automated language identification packages to Twitter's user interface language setting and to a human coding of languages in order to identify common sources of disagreement. The paper also demonstrates that in many cases user-entered profile locations differ from the physical locations users are actually tweeting from. As such, these open-ended, user-generated profile locations cannot be used as useful proxies for the physical locations from which information is published to Twitter.




Most current approaches to characterize and detect hate speech focus on content posted in Online Social Networks. They face shortcomings to collect and annotate hateful speech due to the incompleteness and noisiness of OSN text and the subjectivity of hate speech. These limitations are often aided with constraints that oversimplify the problem, such as considering only tweets containing hate-related words. In this work we partially address these issues by shifting the focus towards users. We develop and employ a robust methodology to collect and annotate hateful users which does not depend directly on lexicon and where the users are annotated given their entire profile. This results in a sample of Twitter's retweet graph containing 100,386 users, out of which 4,972 were annotated. We also collect the users who were banned in the three months that followed the data collection. We show that hateful users differ from normal ones in terms of their activity patterns, word usage, and network structure. We obtain similar results comparing the neighbors of hateful vs. neighbors of normal users and also suspended users vs. active users, increasing the robustness of our analysis. We observe that hateful users are densely connected, and thus formulate the hate speech detection problem as a task of semi-supervised learning over a graph, exploiting the network of connections on Twitter. We find that a node embedding algorithm, which exploits the graph structure, outperforms content-based approaches for the detection of both hateful (95% AUC vs. 88% AUC) and suspended users (93% AUC vs. 88% AUC). Altogether, we present a user-centric view of hate speech, paving the way for better detection and understanding of this relevant and challenging issue.
We present Where Are You? (WAY), a dataset of ~6k dialogs in which two humans -- an Observer and a Locator -- complete a cooperative localization task. The Observer is spawned at random in a 3D environment and can navigate from first-person views while answering questions from the Locator. The Locator must localize the Observer in a detailed top-down map by asking questions and giving instructions. Based on this dataset, we define three challenging tasks: Localization from Embodied Dialog or LED (localizing the Observer from dialog history), Embodied Visual Dialog (modeling the Observer), and Cooperative Localization (modeling both agents). In this paper, we focus on the LED task -- providing a strong baseline model with detailed ablations characterizing both dataset biases and the importance of various modeling choices. Our best model achieves 32.7% success at identifying the Observer's location within 3m in unseen buildings, vs. 70.4% for human Locators.
Angelo Tartaglia, 2012
This talk discusses various aspects of the structure of space-time, presenting mechanisms leading to the explanation of the rigidity of the manifold and to the emergence of time, i.e. of the Lorentzian signature. The proposed ingredient is the analog, in four dimensions, of the deformation energy associated with the common three-dimensional elasticity theory. The inclusion of this additional term in the Lagrangian of empty space-time accounts for gravity as an emergent feature from the microscopic structure of space-time. Once time has legitimately been introduced, a global positioning method based on local measurements of proper times between the arrivals of electromagnetic pulses from independent distant sources is presented. The method considers both pulsars and artificial emitters located on celestial bodies of the solar system as pulsating beacons to be used for navigation and positioning.
The impact of online social media on societal events and institutions is profound, and with the rapid increases in user uptake, we are just starting to understand its ramifications. Social scientists and practitioners who model online discourse as a proxy for real-world behavior often curate large social media datasets. A lack of available tooling aimed at non-data science experts frequently leaves this data (and the insights it holds) underutilized. Here, we propose birdspotter, a tool to analyze and label Twitter users, and birdspotter.ml, an exploratory visualizer for the computed metrics. birdspotter provides an end-to-end analysis pipeline, from the processing of pre-collected Twitter data, to general-purpose labeling of users, and estimating their social influence, within a few lines of code. The package features tutorials and detailed documentation. We also illustrate how to train birdspotter into a fully-fledged bot detector that achieves better than state-of-the-art performance without making any Twitter API online calls, and we showcase its usage in an exploratory analysis of a topical COVID-19 dataset.
Disruptions resulting from an epidemic might often appear to amount to chaos but, in reality, can be understood in a systematic way through the lens of epidemic psychology. According to Philip Strong, the founder of the sociological study of epidemic infectious diseases, not only is an epidemic biological; there is also the potential for three psycho-social epidemics: of fear, moralization, and action. This work empirically tests Strong's model at scale by studying the use of language in 122M tweets related to the COVID-19 pandemic posted in the U.S. during the whole year of 2020. On Twitter, we identified three distinct phases. Each of them is characterized by different regimes of the three psycho-social epidemics. In the refusal phase, users refused to accept reality despite the increasing number of deaths in other countries. In the anger phase (started after the announcement of the first death in the country), users' fear translated into anger about the looming feeling that things were about to change. Finally, in the acceptance phase, which began after the authorities imposed physical-distancing measures, users settled into a new normal for their daily activities. Overall, refusal to accept reality gradually died off as the year went on, while acceptance increasingly took hold. During 2020, as cases surged in waves, so did anger, re-emerging cyclically at each wave. Our real-time operationalization of Strong's model is designed in a way that makes it possible to embed epidemic psychology into real-time models (e.g., epidemiological and mobility models).