In a computer model of language, thousands of different forms (words) are associated with thousands of different meanings (concepts). Reasonable agreement with reality is found for the number of languages in a family and for the Hamming distances between languages.
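As a concrete illustration of the Hamming distance used here, a minimal sketch assuming a dictionary representation of concept-to-form mappings; the representation and the example vocabularies are illustrative assumptions, not the paper's actual data:

```python
# Minimal sketch (hypothetical representation): a language maps each
# meaning (concept) to a form (word), and the Hamming distance between
# two languages is the number of concepts on which their forms differ.
# Both languages are assumed to cover the same set of concepts.

def hamming(lang_a: dict, lang_b: dict) -> int:
    return sum(lang_a[c] != lang_b[c] for c in lang_a)

english = {"water": "water", "dog": "dog", "house": "house"}
german  = {"water": "Wasser", "dog": "Hund", "house": "Haus"}
print(hamming(english, german))  # -> 3
```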
This paper presents Monte Carlo simulations of language populations and the development of language families, showing how a simple model can lead to distributions similar to the ones observed empirically. The model combines features of two models used in earlier work by physicists for the simulation of competition among languages: the Viviane model for the migration of people and the propagation of languages, and the Schulze model, which uses bitstrings as a way of characterising structural features of languages.
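A minimal sketch of the bitstring representation just mentioned, assuming 16 feature bits and a single-bit mutation step; both are illustrative choices rather than parameters taken from the paper:

```python
import random

# Minimal sketch of a Schulze-style bitstring language: each bit encodes
# one structural feature, and a mutation flips one randomly chosen bit.
# The bit length and mutation mechanics are illustrative simplifications.

L = 16  # number of structural features (assumed value)

def mutate(language: int) -> int:
    """Flip one randomly chosen feature bit of a bitstring language."""
    return language ^ (1 << random.randrange(L))

lang = random.getrandbits(L)
print(f"{lang:016b} -> {mutate(lang):016b}")
```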
When a region is conquered by people speaking another language, we assume within the Schulze model that at each iteration each person shifts to the conquering language with probability s. The time needed for the conquering language to become dominant is about 2/s for directed Barabasi-Albert networks, but on the square lattice it diverges as s decreases towards some critical value s_c.
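As a rough illustration of these shift dynamics, a minimal sketch on a well-mixed population, with no network or lattice structure (so the 2/s and s_c results above do not apply directly); the population size and seed are arbitrary assumptions:

```python
import random

def time_to_dominance(n: int = 1000, s: float = 0.1, seed: int = 0) -> int:
    """Iterations until every speaker has shifted to the conquering
    language, when each remaining speaker shifts independently with
    probability s per iteration (well-mixed population, no network)."""
    rng = random.Random(seed)
    remaining = n  # speakers of the old language
    t = 0
    while remaining > 0:
        remaining = sum(1 for _ in range(remaining) if rng.random() >= s)
        t += 1
    return t

# Each speaker survives t iterations with probability (1 - s)**t, so the
# last of n speakers shifts after roughly ln(n)/s iterations here.
print(time_to_dominance(n=1000, s=0.1))
```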
Previous work in the context of natural language querying of temporal databases has established a method to map automatically from a large subset of English time-related questions to suitable expressions of a temporal logic-like language, called TOP.
An algorithm to translate from TOP to the TSQL2 temporal database language has also been defined. This paper shows how TOP expressions can be translated into a simpler logic-like language, called BOT. BOT is very close to traditional first-order predicate logic (FOPL), and hence existing methods to manipulate FOPL expressions can be exploited to interface to time-sensitive applications other than TSQL2 databases, while maintaining the existing English-to-TOP mapping.
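The abstract does not give the translation rules themselves; as a rough illustration of what mapping a temporal-logic-like expression to an FOPL-style formula can look like, here is a hypothetical sketch in which a past-tense operator is reified into an explicit time variable. The operator names, the AST shape, and the output syntax are all assumptions for illustration, not the actual TOP or BOT languages of the paper:

```python
# Hypothetical sketch of reifying a temporal operator into first-order
# logic with an explicit time variable. The Past operator, AST shape,
# and output syntax are illustrative; they are NOT the paper's TOP/BOT.

def to_fopl(expr, t: str = "t") -> str:
    op, *args = expr
    if op == "Past":           # Past[phi] -> exists t (t < now & phi@t)
        body = to_fopl(args[0], t)
        return f"exists {t} ({t} < now & {body})"
    if op == "Pred":           # predicate evaluated at time t
        name, *terms = args
        return f"{name}({', '.join(list(terms) + [t])})"
    raise ValueError(f"unknown operator: {op}")

# Hypothetical query in the style of "Did BA737 get inspected?":
print(to_fopl(("Past", ("Pred", "inspect", "ba737"))))
# -> exists t (t < now & inspect(ba737, t))
```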
Language models trained on billions of tokens have recently led to unprecedented results on many NLP tasks. This success raises the question of whether, in principle, a system can ever "understand" raw text without access to some form of grounding. We formally investigate the abilities of ungrounded systems to acquire meaning. Our analysis focuses on the role of "assertions": textual contexts that provide indirect clues about the underlying semantics. We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence. We find that assertions enable semantic emulation of languages that satisfy a strong notion of semantic transparency. However, for classes of languages where the same expression can take different values in different contexts, we show that emulation can become uncomputable. Finally, we discuss differences between our formal model and natural language, exploring how our results generalize to a modal setting and other semantic relations. Together, our results suggest that assertions in code or language do not provide sufficient signal to fully emulate semantic representations. We formalize ways in which ungrounded language models appear to be fundamentally limited in their ability to "understand".
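As a concrete, much simplified illustration of how equality assertions can let a system emulate an equivalence relation, a union-find sketch over expression strings; this is an illustrative construction, not the paper's formal emulation argument:

```python
# Simplified illustration: treating assertions of the form e1 == e2 as
# indirect semantic clues, union-find recovers the equivalence classes
# they entail. Illustrative only; not the paper's formal model.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

assertions = [("1+1", "2"), ("2", "two"), ("3", "three")]  # observed contexts
uf = UnionFind()
for lhs, rhs in assertions:
    uf.union(lhs, rhs)

# "1+1" and "two" were never asserted equal directly, but their
# equivalence follows from the asserted contexts:
print(uf.find("1+1") == uf.find("two"))    # -> True
print(uf.find("1+1") == uf.find("three"))  # -> False
```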
Quantum Natural Language Processing (QNLP) deals with the design and implementation of NLP models intended to be run on quantum hardware. In this paper, we present results on the first NLP experiments conducted on Noisy Intermediate-Scale Quantum (NISQ) computers for datasets of size >= 100 sentences. Exploiting the formal similarity of the compositional model of meaning by Coecke et al. (2010) with quantum theory, we create representations for sentences that have a natural mapping to quantum circuits. We use these representations to implement and successfully train two NLP models that solve simple sentence classification tasks on quantum hardware. We describe in detail the main principles, the process and challenges of these experiments, in a way accessible to NLP researchers, thus paving the way for practical Quantum Natural Language Processing.
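A drastically simplified sketch of the circuit-based classification idea, using a single-qubit parameterised circuit simulated in NumPy rather than the paper's compositional (DisCoCat) pipeline or real NISQ hardware; the word-to-rotation encoding, vocabulary, toy dataset, and training scheme are all illustrative assumptions:

```python
import numpy as np

# Drastically simplified sketch of variational sentence classification:
# each word carries one Ry rotation angle, a sentence's circuit applies
# its words' rotations to |0>, and the probability of measuring |1> is
# the class score. Vocabulary, data, and training are illustrative
# assumptions; this is not the paper's DisCoCat-to-circuit pipeline.

rng = np.random.default_rng(0)
vocab = {"alice": 0, "bob": 1, "likes": 2, "hates": 3, "food": 4}
theta = rng.normal(0.0, 0.1, len(vocab))  # one angle per word

def p_one(sentence: str) -> float:
    """P(measure |1>) after applying the sentence's Ry rotations to |0>.
    Ry(a)Ry(b) = Ry(a+b), so the angles simply add."""
    total = sum(theta[vocab[w]] for w in sentence.split())
    return float(np.sin(total / 2.0) ** 2)

# Hypothetical toy task: label 1 if the sentence expresses liking.
data = [("alice likes food", 1), ("bob likes food", 1),
        ("alice hates food", 0), ("bob hates food", 0)]

def loss() -> float:
    return sum((p_one(s) - y) ** 2 for s, y in data)

# Finite-difference gradient descent on the simulator; on real NISQ
# hardware one would use parameter-shift rules or SPSA instead.
eps, lr = 1e-4, 0.5
for _ in range(1000):
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        theta[i] += eps
        up = loss()
        theta[i] -= 2 * eps
        down = loss()
        theta[i] += eps
        grad[i] = (up - down) / (2 * eps)
    theta -= lr * grad

for s, y in data:
    print(f"{s!r}: p(1) = {p_one(s):.2f}  (label {y})")
```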