No Arabic abstract
Neural networks have a reputation for being better at solving statistical or approximate problems than at performing calculations or working with symbolic data. In this paper, we show that they can be surprisingly good at more elaborated tasks in mathematics, such as symbolic integration and solving differential equations. We propose a syntax for representing mathematical problems, and methods for generating large datasets that can be used to train sequence-to-sequence models. We achieve results that outperform commercial Computer Algebra Systems such as Matlab or Mathematica.
We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the problem, understanding what features are learned, why deep architectures perform exceptionally well in physical problems, and which fine aspects of an architecture affect the behavior of a learning task in which way. We present an overview of modern approaches that yield partial answers to these questions. For selected approaches, we describe the main ideas in more detail.
The complexity of matrix multiplication (hereafter MM) has been intensively studied since 1969, when Strassen surprisingly decreased the exponent 3 in the cubic cost of the straightforward classical MM to log 2 (7) $approx$ 2.8074. Applications to some fundamental problems of Linear Algebra and Computer Science have been immediately recognized, but the researchers in Computer Algebra keep discovering more and more applications even today, with no sign of slowdown. We survey the unfinished history of decreasing the exponent towards its information lower bound 2, recall some important techniques discovered in this process and linked to other fields of computing, reveal sample surprising applications to fast computation of the inner products of two vectors and summation of integers, and discuss the curse of recursion, which separates the progress in fast MM into its most acclaimed and purely theoretical part and into valuable acceleration of MM of feasible sizes. Then, in the second part of our paper, we cover fast MM in realistic symbolic computations and discuss applications and implementation of fast exact matrix multiplication. We first review how most of exact linear algebra can be reduced to matrix multiplication over small finite fields. Then we highlight the differences in the design of approximate and exact implementations of fast MM, taking into account nowadays processor and memory hierarchies. In the concluding section we comment on current perspectives of the study of fast MM.
Hyperparameter optimization in machine learning (ML) deals with the problem of empirically learning an optimal algorithm configuration from data, usually formulated as a black-box optimization problem. In this work, we propose a zero-shot method to meta-learn symbolic default hyperparameter configurations that are expressed in terms of the properties of the dataset. This enables a much faster, but still data-dependent, configuration of the ML algorithm, compared to standard hyperparameter optimization approaches. In the past, symbolic and static default values have usually been obtained as hand-crafted heuristics. We propose an approach of learning such symbolic configurations as formulas of dataset properties from a large set of prior evaluations on multiple datasets by optimizing over a grammar of expressions using an evolutionary algorithm. We evaluate our method on surrogate empirical performance models as well as on real data across 6 ML algorithms on more than 100 datasets and demonstrate that our method indeed finds viable symbolic defaults.
We give a brief introduction to FORM, a symbolic programming language for massive batch operations, designed by J.A.M. Vermaseren. In particular, we stress various methods to efficiently use FORM under the UNIX operating system. Several scripts and examples are given, and suggestions on how to use the vim editor as development platform.
In this note, I develop my personal view on the scope and relevance of symbolic computation in software science. For this, I discuss the interaction and differences between symbolic computation, software science, automatic programming, mathematical knowledge management, artificial intelligence, algorithmic intelligence, numerical computation, and machine learning. In the discussion of these notions, I allow myself to refer also to papers (1982, 1985, 2001, 2003, 2013) of mine in which I expressed my views on these areas at early stages of some of these fields.