No Arabic abstract
Tags assigned by users to shared content can be ambiguous. As a possible solution, we propose semantic tagging as a collaborative process in which a user selects and associates Web resources drawn from a knowledge context. We applied this general technique in the specific context of online historical maps and allowed users to annotate and tag them. To study the effects of semantic tagging on tag production, the types and categories of obtained tags, and user task load, we conducted an in-lab within-subject experiment with 24 participants who annotated and tagged two distinct maps. We found that the semantic tagging implementation does not affect these parameters, while providing tagging relationships to well-defined concept definitions. Compared to label-based tagging, our technique also gathers positive and negative tagging relationships. We believe that our findings carry implications for designers who want to adopt semantic tagging in other contexts and systems on the Web.
In this work, we present a new semantic segmentation model for historical city maps that surpasses the state of the art in terms of flexibility and performance. Research in automatic map processing is largely focused on homogeneous corpora or even individual maps, leading to inflexible algorithms. Recently, convolutional neural networks have opened new perspectives for the development of more generic tools. Based on two new maps corpora, the first one centered on Paris and the second one gathering cities from all over the world, we propose a method for operationalizing the figuration based on traditional computer vision algorithms that allows large-scale quantitative analysis. In a second step, we propose a semantic segmentation model based on neural networks and implement several improvements. Finally, we analyze the impact of map figuration on segmentation performance and evaluate future ways to improve the representational flexibility of neural networks. To conclude, we show that these networks are able to semantically segment map data of a very large figurative diversity with efficiency.
Social bookmarking systems allow users to organise collections of resources on the Web in a collaborative fashion. The increasing popularity of these systems as well as first insights into their emergent semantics have made them relevant to disciplines like knowledge extraction and ontology learning. The problem of devising methods to measure the semantic relatedness between tags and characterizing it semantically is still largely open. Here we analyze three measures of tag relatedness: tag co-occurrence, cosine similarity of co-occurrence distributions, and FolkRank, an adaptation of the PageRank algorithm to folksonomies. Each measure is computed on tags from a large-scale dataset crawled from the social bookmarking system del.icio.us. To provide a semantic grounding of our findings, a connection to WordNet (a semantic lexicon for the English language) is established by mapping tags into synonym sets of WordNet, and applying there well-known metrics of semantic similarity. Our results clearly expose different characteristics of the selected measures of relatedness, making them applicable to different subtasks of knowledge extraction such as synonym detection or discovery of concept hierarchies.
This paper presents a use case exploring the application of the Archival Resource Key (ARK) persistent identifier for promoting and maintaining ontologies. In particular, we look at improving computation with an in-house ontology server in the context of temporally aligned vocabularies. This effort demonstrates the utility of ARKs in preparing historical ontologies for computational archival science.
Although neural sequence-to-sequence models have been successfully applied to semantic parsing, they fail at compositional generalization, i.e., they are unable to systematically generalize to unseen compositions of seen components. Motivated by traditional semantic parsing where compositionality is explicitly accounted for by symbolic grammars, we propose a new decoding framework that preserves the expressivity and generality of sequence-to-sequence models while featuring lexicon-style alignments and disentangled information processing. Specifically, we decompose decoding into two phases where an input utterance is first tagged with semantic symbols representing the meaning of individual words, and then a sequence-to-sequence model is used to predict the final meaning representation conditioning on the utterance and the predicted tag sequence. Experimental results on three semantic parsing datasets show that the proposed approach consistently improves compositional generalization across model architectures, domains, and semantic formalisms.
There is extensive, yet fragmented, evidence of gender differences in academia suggesting that women are under-represented in most scientific disciplines, publish fewer articles throughout a career, and their work acquires fewer citations. Here, we offer a comprehensive picture of longitudinal gender discrepancies in performance through a bibliometric analysis of academic careers by reconstructing the complete publication history of over 1.5 million gender-identified authors whose publishing career ended between 1955 and 2010, covering 83 countries and 13 disciplines. We find that, paradoxically, the increase of participation of women in science over the past 60 years was accompanied by an increase of gender differences in both productivity and impact. Most surprisingly though, we uncover two gender invariants, finding that men and women publish at a comparable annual rate and have equivalent career-wise impact for the same size body of work. Finally, we demonstrate that differences in dropout rates and career length explain a large portion of the reported career-wise differences in productivity and impact. This comprehensive picture of gender inequality in academia can help rephrase the conversation around the sustainability of womens careers in academia, with important consequences for institutions and policy makers.