Learning Languages with Decidable Hypotheses


Abstract in English

In language learning in the limit, the most common type of hypothesis is to give an enumerator for a language. This so-called $W$-index allows for naming arbitrary computably enumerable languages, with the drawback that even the membership problem is undecidable. In this paper we use a different system which allows for naming arbitrary decidable languages, namely programs for characteristic functions (called $C$-indices). These indices have the drawback that it is now not decidable whether a given hypothesis is even a legal $C$-index. In this first analysis of learning with $C$-indices, we give a structured account of the learning power of various restrictions employing $C$-indices, also when compared with $W$-indices. We establish a hierarchy of learning power depending on whether $C$-indices are required (a) on all outputs; (b) only on outputs relevant for the class to be learned and (c) only in the limit as final, correct hypotheses. Furthermore, all these settings are weaker than learning with $W$-indices (even when restricted to classes of computable languages). We analyze all these questions also in relation to the mode of data presentation. Finally, we also ask about the relation of semantic versus syntactic convergence and derive the map of pairwise relations for these two kinds of convergence coupled with various forms of data presentation.

Download