Do you want to publish a course? Click here

Representing Formal Languages: A Comparison Between Finite Automata and Recurrent Neural Networks

68   0   0.0 ( 0 )
 Added by Joshua Michalenko
 Publication date 2019
and research's language is English




Ask ChatGPT about the research

We investigate the internal representations that a recurrent neural network (RNN) uses while learning to recognize a regular formal language. Specifically, we train a RNN on positive and negative examples from a regular language, and ask if there is a simple decoding function that maps states of this RNN to states of the minimal deterministic finite automaton (MDFA) for the language. Our experiments show that such a decoding function indeed exists, and that it maps states of the RNN not to MDFA states, but to states of an {em abstraction} obtained by clustering small sets of MDFA states into superstates. A qualitative analysis reveals that the abstraction often has a simple interpretation. Overall, the results suggest a strong structural relationship between internal representations used by RNNs and finite automata, and explain the well-known ability of RNNs to recognize formal grammatical structure.



rate research

Read More

In this paper, we present connections between three models used in different research fields: weighted finite automata~(WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks which encompasses a set of optimization techniques for high-order tensors used in quantum physics and numerical analysis. We first present an intrinsic relation between WFA and the tensor train decomposition, a particular form of tensor network. This relation allows us to exhibit a novel low rank structure of the Hankel matrix of a function computed by a WFA and to design an efficient spectral learning algorithm leveraging this structure to scale the algorithm up to very large Hankel matrices. We then unravel a fundamental connection between WFA and second-order recurrent neural networks~(2-RNN): in the case of sequences of discrete symbols, WFA and 2-RNN with linear activation functions are expressively equivalent. Furthermore, we introduce the first provable learning algorithm for linear 2-RNN defined over sequences of continuous input vectors. This algorithm relies on estimating low rank sub-blocks of the Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performances of the proposed learning algorithm are assessed in a simulation study on both synthetic and real-world data.
This paper is an attempt to bridge the gap between deep learning and grammatical inference. Indeed, it provides an algorithm to extract a (stochastic) formal language from any recurrent neural network trained for language modelling. In detail, the algorithm uses the already trained network as an oracle -- and thus does not require the access to the inner representation of the black-box -- and applies a spectral approach to infer a weighted automaton. As weighted automata compute linear functions, they are computationally more efficient than neural networks and thus the nature of the approach is the one of knowledge distillation. We detail experiments on 62 data sets (both synthetic and from real-world applications) that allow an in-depth study of the abilities of the proposed algorithm. The results show the WA we extract are good approximations of the RNN, validating the approach. Moreover, we show how the process provides interesting insights toward the behavior of RNN learned on data, enlarging the scope of this work to the one of explainability of deep learning models.
112 - Hans Raj Tiwary 2016
In this article we undertake a study of extension complexity from the perspective of formal languages. We define a natural way to associate a family of polytopes with binary languages. This allows us to define the notion of extension complexity of formal languages. We prove several closure properties of languages admitting compact extended formulations. Furthermore, we give a sufficient machine characterization of compact languages. We demonstrate the utility of this machine characterization by obtaining upper bounds for polytopes for problems in nondeterministic logspace; lower bounds in streaming models; and upper bounds on extension complexities of several polytopes.
This volume contains the proceedings of the 11th International Symposium on Games, Automata, Logic and Formal Verification (GandALF 2020). The symposium took place as a fully online event on September 21-22, 2020. The GandALF symposium was established by a group of Italian computer scientists interested in mathematical logic, automata theory, game theory, and their applications to the specification, design, and verification of complex systems. Its aim is to provide a forum where people from different areas, and possibly with different backgrounds, can fruitfully interact. GandALF has a truly international spirit, as witnessed by the composition of the program and steering committee and by the country distribution of the submitted papers.
In this paper we present a novel approach to automatically infer parameters of spiking neural networks. Neurons are modelled as timed automata waiting for inputs on a number of different channels (synapses), for a given amount of time (the accumulation period). When this period is over, the current potential value is computed considering current and past inputs. If this potential overcomes a given threshold, the automaton emits a broadcast signal over its output channel , otherwise it restarts another accumulation period. After each emission, the automaton remains inactive for a fixed refractory period. Spiking neural networks are formalised as sets of automata, one for each neuron, running in parallel and sharing channels according to the network structure. Such a model is formally validated against some crucial properties defined via proper temporal logic formulae. The model is then exploited to find an assignment for the synaptical weights of neural networks such that they can reproduce a given behaviour. The core of this approach consists in identifying some correcting actions adjusting synaptical weights and back-propagating them until the expected behaviour is displayed. A concrete case study is discussed.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا