Neural Arithmetic Expression Calculator

Published by Kaiyu Chen

Publication date: 2018

Research field: Informatics Engineering

Paper language: English

Authors: Kaiyu Chen, Yihan Dong, Xipeng Qiu

Artificial Intelligence, Machine Learning

Abstract

This paper presents a pure neural solver for the arithmetic expression calculation (AEC) problem. Previous work utilizes the powerful capabilities of deep neural networks and attempts to build an end-to-end model to solve this problem. However, most of these methods can only deal with additive operations. It is still challenging to solve the complex expression calculation problem, which includes adding, subtracting, multiplying, dividing and bracketing operations. In this work, we regard arithmetic expression calculation as a hierarchical reinforcement learning problem. An arithmetic operation is decomposed into a series of sub-tasks, and each sub-task is dealt with by a skill module. The skill module could be a basic module performing elementary operations, or an interactive module performing complex operations by invoking other skill models. With curriculum learning, our model can deal with complex arithmetic expression calculation using the deep hierarchical structure of skill models. Experiments show that our model significantly outperforms the previous models for arithmetic expression calculation.
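
The decomposition into basic and interactive skills can be illustrated with a small, hand-written sketch. This is not the authors' neural model: nothing here is learned, there is no reinforcement learning or curriculum, and every function name (`add_digits`, `add_numbers`, `multiply`, `evaluate`) is a hypothetical choice made for illustration. The sketch also assumes non-negative integers and only the `+`, `*` and bracket operations. It shows only the shape of the hierarchy, where a higher-level skill solves its task by invoking lower-level ones.

```python
from typing import List, Tuple


def add_digits(a: int, b: int, carry: int) -> Tuple[int, int]:
    """Basic skill (hypothetical): add two single digits plus a carry."""
    total = a + b + carry
    return total % 10, total // 10


def add_numbers(x: int, y: int) -> int:
    """Interactive skill: multi-digit addition built from add_digits."""
    xs = [int(d) for d in reversed(str(x))]
    ys = [int(d) for d in reversed(str(y))]
    digits, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        a = xs[i] if i < len(xs) else 0
        b = ys[i] if i < len(ys) else 0
        digit, carry = add_digits(a, b, carry)
        digits.append(digit)
    if carry:
        digits.append(carry)
    return int("".join(str(d) for d in reversed(digits)))


def multiply(x: int, y: int) -> int:
    """Interactive skill: multiplication as repeated calls to add_numbers."""
    result = 0
    for _ in range(y):
        result = add_numbers(result, x)
    return result


def tokenize(expr: str) -> List[str]:
    """Split an expression into number, operator and bracket tokens."""
    tokens, i = [], 0
    while i < len(expr):
        ch = expr[i]
        if ch.isdigit():
            j = i
            while j < len(expr) and expr[j].isdigit():
                j += 1
            tokens.append(expr[i:j])
            i = j
        elif ch in "+*()":
            tokens.append(ch)
            i += 1
        elif ch.isspace():
            i += 1
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return tokens


def evaluate(tokens: List[str]) -> int:
    """Top-level skill: resolve brackets, then '*', then '+'."""
    tokens = list(tokens)
    # Reduce the innermost bracket by invoking evaluate on its contents.
    while "(" in tokens:
        close = tokens.index(")")
        open_ = max(i for i in range(close) if tokens[i] == "(")
        inner = evaluate(tokens[open_ + 1:close])
        tokens[open_:close + 1] = [str(inner)]
    # Apply the higher-precedence skill first, left to right.
    for op, skill in (("*", multiply), ("+", add_numbers)):
        while op in tokens:
            i = tokens.index(op)
            value = skill(int(tokens[i - 1]), int(tokens[i + 1]))
            tokens[i - 1:i + 2] = [str(value)]
    return int(tokens[0])
```

For example, `evaluate(tokenize("(12 + 3) * (4 + 5)"))` reduces each bracket through `add_numbers`, then calls `multiply`, which itself calls `add_numbers` repeatedly.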

Read also

Arithmetic Expression Construction

136 - Leo Alcock, Sualeh Asif, Jeffrey Bosboom 2020

When can $n$ given numbers be combined using arithmetic operators from a given subset of $\{+, -, \times, \div\}$ to obtain a given target number? We study three variations of this problem of Arithmetic Expression Construction: when the expression (1) is unconstrained; (2) has a specified pattern of parentheses and operators (and only the numbers need to be assigned to blanks); or (3) must match a specified ordering of the numbers (but the operators and parenthesization are free). For each of these variants, and many of the subsets of $\{+, -, \times, \div\}$, we prove the problem NP-complete, sometimes in the weak sense and sometimes in the strong sense. Most of these proofs make use of a rational function framework which proves equivalence of these problems for values in rational functions with values in positive integers.
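
A brute-force check for the unconstrained variant (1) can be sketched as follows. This is an illustrative exhaustive search, not an algorithm from the paper. The function name `can_construct` is hypothetical, and the sketch assumes that every number must be used exactly once and that intermediate values may be arbitrary rationals, which may differ from the paper's exact problem definition. Its running time grows exponentially with $n$.

```python
from fractions import Fraction
from itertools import combinations
from typing import List, Sequence


def _search(values: List[Fraction], target: Fraction, ops: str) -> bool:
    # One value left: the expression is complete.
    if len(values) == 1:
        return values[0] == target
    # Pick any two values, combine them, and recurse on the smaller list.
    for i, j in combinations(range(len(values)), 2):
        a, b = values[i], values[j]
        rest = [v for k, v in enumerate(values) if k not in (i, j)]
        candidates = set()
        if "+" in ops:
            candidates.add(a + b)
        if "*" in ops:
            candidates.add(a * b)
        if "-" in ops:
            candidates.update((a - b, b - a))
        if "/" in ops:
            if b != 0:
                candidates.add(a / b)
            if a != 0:
                candidates.add(b / a)
        for c in candidates:
            if _search(rest + [c], target, ops):
                return True
    return False


def can_construct(numbers: Sequence[int], target: int, ops: str = "+-*/") -> bool:
    """Return True if some expression over all `numbers` equals `target`."""
    values = [Fraction(n) for n in numbers]
    return _search(values, Fraction(target), ops)
```

For instance, `can_construct([1, 3, 4, 6], 24)` succeeds through $6 \div (1 - 3 \div 4)$, which is why exact `Fraction` arithmetic is used instead of floating point.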

Computational Complexity

Neural Logic Networks

124 - Shaoyun Shi, Hanxiong Chen, Min Zhang 2019

Recent years have witnessed the great success of deep neural networks in many research areas. The fundamental idea behind the design of most neural networks is to learn similarity patterns from data for prediction and inference, which lacks the ability of logical reasoning. However, the concrete ability of logical reasoning is critical to many theoretical and practical problems. In this paper, we propose Neural Logic Network (NLN), which is a dynamic neural architecture that builds the computational graph according to input logical expressions. It learns basic logical operations as neural modules, and conducts propositional logical reasoning through the network for inference. Experiments on simulated data show that NLN achieves significant performance on solving logical equations. Further experiments on real-world data show that NLN significantly outperforms state-of-the-art models on collaborative filtering and personalized recommendation tasks.

Artificial Intelligence, Machine Learning, Neural and Evolutionary Computing

Neural Production Systems

120 - Anirudh Goyal, Aniket Didolkar, Nan Rosemary Ke 2021

Visual environments are structured, consisting of distinct objects or entities. These entities have properties -- both visible and latent -- that determine the manner in which they interact with one another. To partition images into entities, deep-learning researchers have proposed structural inductive biases such as slot-based architectures. To model interactions among entities, equivariant graph neural nets (GNNs) are used, but these are not particularly well suited to the task for two reasons. First, GNNs do not predispose interactions to be sparse, as relationships among independent entities are likely to be. Second, GNNs do not factorize knowledge about interactions in an entity-conditional manner. As an alternative, we take inspiration from cognitive science and resurrect a classic approach, production systems, which consist of a set of rule templates that are applied by binding placeholder variables in the rules to specific entities. Rules are scored on their match to entities, and the best fitting rules are applied to update entity properties. In a series of experiments, we demonstrate that this architecture achieves a flexible, dynamic flow of control and serves to factorize entity-specific and rule-based information. This disentangling of knowledge achieves robust future-state prediction in rich visual environments, outperforming state-of-the-art methods using GNNs, and allows for the extrapolation from simple (few object) environments to more complex environments.

Artificial Intelligence, Machine Learning

Neural Coarse-Graining: Extracting slowly-varying latent degrees of freedom with neural networks

63 - Nicholas Guttenberg, Martin Biehl, Ryota Kanai 2016

We present a loss function for neural networks that encompasses an idea of trivial versus non-trivial predictions, such that the network jointly determines its own prediction goals and learns to satisfy them. This permits the network to choose subsets of a problem which are most amenable to its abilities to focus on solving, while discarding distracting elements that interfere with its learning. To do this, the network first transforms the raw data into a higher-level categorical representation, and then trains a predictor from that new time series to its future. To prevent a trivial solution of mapping the signal to zero, we introduce a measure of non-triviality via a contrast between the prediction error of the learned model with a naive model of the overall signal statistics. The transform can learn to discard uninformative and unpredictable components of the signal in favor of the features which are both highly predictive and highly predictable. This creates a coarse-grained model of the time-series dynamics, focusing on predicting the slowly varying latent parameters which control the statistics of the time-series, rather than predicting the fast details directly. The result is a semi-supervised algorithm which is capable of extracting latent parameters, segmenting sections of time-series with differing statistics, and building a higher-level representation of the underlying dynamics from unlabeled data.

Artificial Intelligence, Machine Learning

Neural Dynamic Programming for Musical Self Similarity

81 - Christian J. Walder, Dongwoo Kim 2018

We present a neural sequence model designed specifically for symbolic music. The model is based on a learned edit distance mechanism which generalises a classic recursion from computer science, leading to a neural dynamic program. Repeated motifs are detected by learning the transformations between them. We represent the arising computational dependencies using a novel data structure, the edit tree; this perspective suggests natural approximations which afford the scaling up of our otherwise cubic time algorithm. We demonstrate our model on real and synthetic data; in all cases it outperforms a strong stacked long short-term memory benchmark.
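
The abstract does not spell out which recursion is generalised. Assuming it refers to the standard Levenshtein edit-distance recursion, the classic (non-neural, unlearned) dynamic program looks like the sketch below. The function name `edit_distance` and the unit costs are assumptions for illustration; the paper's learned mechanism is not reproduced here.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance with unit insert, delete and substitute costs."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            substitute = prev[j - 1] + (0 if x == y else 1)
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]
```

The function accepts any sequences, so a motif can be passed as a list of pitch numbers, for example `edit_distance([60, 62, 64], [60, 64])`.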

Artificial Intelligence, Machine Learning