No Arabic abstract
We present BRIDGE, a powerful sequential architecture for modeling dependencies between natural language questions and relational databases in cross-DB semantic parsing. BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question. The hybrid sequence is encoded by BERT with minimal subsequent layers and the text-DB contextualization is realized via the fine-tuned deep attention in BERT. Combined with a pointer-generator decoder with schema-consistency driven search space pruning, BRIDGE attained state-of-the-art performance on popular cross-DB text-to-SQL benchmarks, Spider (71.1% dev, 67.5% test with ensemble model) and WikiSQL (92.6% dev, 91.9% test). Our analysis shows that BRIDGE effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks. Our implementation is available at url{https://github.com/salesforce/TabularSemanticParsing}.
In this work, we focus on two crucial components in the cross-domain text-to-SQL semantic parsing task: schema linking and value filling. To encourage the model to learn better encoding ability, we propose a column selection auxiliary task to empower the encoder with the relevance matching capability by using explicit learning targets. Furthermore, we propose two value filling methods to build the bridge from the existing zero-shot semantic parsers to real-world applications, considering most of the existing parsers ignore the values filling in the synthesized SQL. With experiments on Spider, our proposed framework improves over the baselines on the execution accuracy and exact set match accuracy when database contents are unavailable, and detailed analysis sheds light on future work.
This paper presents a novel approach to translating natural language questions to SQL queries for given tables, which meets three requirements as a real-world data analysis application: cross-domain, multilingualism and enabling quick-start. Our proposed approach consists of: (1) a novel data abstraction step before the parser to make parsing table-agnosticism; (2) a set of semantic rules for parsing abstracted data-analysis questions to intermediate logic forms as tree derivations to reduce the search space; (3) a neural-based model as a local scoring function on a span-based semantic parser for structured optimization and efficient inference. Experiments show that our approach outperforms state-of-the-art algorithms on a large open benchmark dataset WikiSQL. We also achieve promising results on a small dataset for more complex queries in both English and Chinese, which demonstrates our language expansion and quick-start ability.
Recently, Text-to-SQL for multi-turn dialogue has attracted great interest. Here, the user input of the current turn is parsed into the corresponding SQL query of the appropriate database, given all previous dialogue history. Current approaches mostly employ end-to-end models and consequently face two challenges. First, dialogue history modeling and Text-to-SQL parsing are implicitly combined, hence it is hard to carry out interpretable analysis and obtain targeted improvement. Second, SQL annotation of multi-turn dialogue is very expensive, leading to training data sparsity. In this paper, we propose a novel decoupled multi-turn Text-to-SQL framework, where an utterance rewrite model first explicitly solves completion of dialogue context, and then a single-turn Text-to-SQL parser follows. A dual learning approach is also proposed for the utterance rewrite model to address the data sparsity problem. Compared with end-to-end approaches, the proposed decoupled method can achieve excellent performance without any annotated in-domain data. With just a few annotated rewrite cases, the decoupled method outperforms the released state-of-the-art end-to-end models on both SParC and CoSQL datasets.
The task of multi-turn text-to-SQL semantic parsing aims to translate natural language utterances in an interaction into SQL queries in order to answer them using a database which normally contains multiple table schemas. Previous studies on this task usually utilized contextual information to enrich utterance representations and to further influence the decoding process. While they ignored to describe and track the interaction states which are determined by history SQL queries and are related with the intent of current utterance. In this paper, two kinds of interaction states are defined based on schema items and SQL keywords separately. A relational graph neural network and a non-linear layer are designed to update the representations of these two states respectively. The dynamic schema-state and SQL-state representations are then utilized to decode the SQL query corresponding to current utterance. Experimental results on the challenging CoSQL dataset demonstrate the effectiveness of our proposed method, which achieves better performance than other published methods on the task leaderboard.
As a promising paradigm, interactive semantic parsing has shown to improve both semantic parsing accuracy and user confidence in the results. In this paper, we propose a new, unified formulation of the interactive semantic parsing problem, where the goal is to design a model-based intelligent agent. The agent maintains its own state as the current predicted semantic parse, decides whether and where human intervention is needed, and generates a clarification question in natural language. A key part of the agent is a world model: it takes a percept (either an initial question or subsequent feedback from the user) and transitions to a new state. We then propose a simple yet remarkably effective instantiation of our framework, demonstrated on two text-to-SQL datasets (WikiSQL and Spider) with different state-of-the-art base semantic parsers. Compared to an existing interactive semantic parsing approach that treats the base parser as a black box, our approach solicits less user feedback but yields higher run-time accuracy.