DuoRAT: Towards Simpler Text-to-SQL Models

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Recent neural text-to-SQL models can effectively translate natural language questions to corresponding SQL queries on unseen databases. Working mostly on the Spider dataset, researchers have proposed increasingly sophisticated solutions to the problem. Contrary to this trend, in this paper we focus on simplifications. We begin by building DuoRAT, a re-implementation of the state-of-the-art RAT-SQL model that, unlike RAT-SQL, uses only relation-aware or vanilla transformers as building blocks. We perform several ablation experiments using DuoRAT as the baseline model. Our experiments confirm the usefulness of some techniques and point out the redundancy of others, including structural SQL features and features that link the question with the schema.

References used

https://aclanthology.org/

Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data

Association for Computational Linguistics, 2021 (article)

Most available semantic parsing datasets, comprising pairs of natural utterances and logical forms, were collected solely for the purpose of training and evaluating natural language understanding systems. As a result, they do not contain any of the richness and variety of naturally occurring utterances, where humans ask about data they need or are curious about. In this work, we release SEDE, a dataset with 12,023 pairs of utterances and SQL queries collected from real usage on the Stack Exchange website. We show that these pairs contain a variety of real-world challenges that have rarely been reflected so far in any other semantic parsing dataset, propose an evaluation metric based on comparison of partial query clauses that is more suitable for real-world queries, and conduct experiments with strong baselines, showing a large gap between performance on SEDE and performance on other common datasets.

Structure-Grounded Pretraining for Text-to-SQL

Association for Computational Linguistics, 2021 (article)

Learning to capture text-table alignment is essential for tasks like text-to-SQL. A model needs to correctly recognize natural language references to columns and values and to ground them in the given database schema. In this paper, we present a novel weakly supervised Structure-Grounded pretraining framework (STRUG) for text-to-SQL that can effectively learn to capture text-table alignment based on a parallel text-table corpus. We identify a set of novel pretraining tasks: column grounding, value grounding and column-value mapping, and leverage them to pretrain a text-table encoder. Additionally, to evaluate different methods under more realistic text-table alignment settings, we create a new evaluation set, Spider-Realistic, based on the Spider dev set with explicit mentions of column names removed, and adopt eight existing text-to-SQL datasets for cross-database evaluation. STRUG brings significant improvement over BERT-LARGE in all settings. Compared with existing pretraining methods such as GRAPPA, STRUG achieves similar performance on Spider, and outperforms all baselines on more realistic sets. All the code and data used in this work will be open-sourced to facilitate future research.

Attainable Text-to-Text Machine Translation vs. Translation: Issues Beyond Linguistic Processing

Association for Computational Linguistics, 2021 (article)

Existing approaches for machine translation (MT) mostly translate given text in the source language into the target language, without explicitly referring to information indispensable for producing proper translation. This includes not only information in other textual elements and modalities than texts in the same document, but also extra-document and non-linguistic information, such as norms and skopos. To design better translation production work-flows, we need to distinguish translation issues that could be resolved by the existing text-to-text approaches and those beyond them. To this end, we conducted an analytic assessment of MT outputs, taking an English-to-Japanese news translation task as a case study. First, examples of translation issues and their revisions were collected by a two-stage post-editing (PE) method: performing minimal PE to obtain translation attainable based on the given textual information and further performing full PE to obtain truly acceptable translation referring to any information if necessary. Then, the collected revision examples were manually analyzed. We revealed dominant issues and information indispensable for resolving them, such as fine-grained style specifications, terminology, domain-specific knowledge, and reference documents, delineating a clear distinction between translation and what text-to-text MT can ultimately attain.

Natural SQL: Making SQL Easier to Infer from Natural Language Specifications

Association for Computational Linguistics, 2021 (article)

Addressing the mismatch between natural language descriptions and the corresponding SQL queries is a key challenge for text-to-SQL translation. To bridge this gap, we propose an SQL intermediate representation (IR) called Natural SQL (NatSQL). Specifically, NatSQL preserves the core functionalities of SQL, while it simplifies the queries as follows: (1) dispensing with operators and keywords such as GROUP BY, HAVING, FROM, JOIN ON, for which counterparts are usually hard to find in the text descriptions; (2) removing the need for nested subqueries and set operators; and (3) making the schema linking easier by reducing the required number of schema items. On Spider, a challenging text-to-SQL benchmark that contains complex and nested SQL queries, we demonstrate that NatSQL outperforms other IRs, and significantly improves the performance of several previous SOTA models. Furthermore, for existing models that do not support executable SQL generation, NatSQL easily enables them to generate executable SQL queries, and achieves the new state-of-the-art execution accuracy.

Semi-Automatic Construction of Text-to-SQL Data for Domain Transfer

Association for Computational Linguistics, 2021 (article)

Strong and affordable in-domain data is a desirable asset when transferring trained semantic parsers to novel domains. As previous methods for semi-automatically constructing such data cannot handle the complexity of realistic SQL queries, we propose to construct SQL queries via context-dependent sampling, and introduce the concept of topic. Along with our SQL query construction method, we propose a novel pipeline of semi-automatic Text-to-SQL dataset construction that covers the broad space of SQL queries. We show that the created dataset is comparable with expert annotation along multiple dimensions, and is capable of improving domain transfer performance for SOTA semantic parsers.
