No Arabic abstract
On the WikiSQL benchmark, state-of-the-art text-to-SQL systems typically take a slot-filling approach by building several dedicated models for each type of slots. Such modularized systems are not only complex butalso of limited capacity for capturing inter-dependencies among SQL clauses. To solve these problems, this paper proposes a novel extraction-linking approach, where a unified extractor recognizes all types of slot mentions appearing in the question sentence before a linker maps the recognized columns to the table schema to generate executable SQL queries. Trained with automatically generated annotations, the proposed method achieves the first place on the WikiSQL benchmark.
When translating natural language questions into SQL queries to answer questions from a database, contemporary semantic parsing models struggle to generalize to unseen database schemas. The generalization challenge lies in (a) encoding the database relations in an accessible way for the semantic parser, and (b) modeling alignment between database columns and their mentions in a given query. We present a unified framework, based on the relation-aware self-attention mechanism, to address schema encoding, schema linking, and feature representation within a text-to-SQL encoder. On the challenging Spider dataset this framework boosts the exact match accuracy to 57.2%, surpassing its best counterparts by 8.7% absolute improvement. Further augmented with BERT, it achieves the new state-of-the-art performance of 65.6% on the Spider leaderboard. In addition, we observe qualitative improvements in the models understanding of schema linking and alignment. Our implementation will be open-sourced at https://github.com/Microsoft/rat-sql.
Document-level relation extraction aims to discover relations between entities across a whole document. How to build the dependency of entities from different sentences in a document remains to be a great challenge. Current approaches either leverage syntactic trees to construct document-level graphs or aggregate inference information from different sentences. In this paper, we build cross-sentence dependencies by inferring compositional relations between inter-sentence mentions. Adopting aggressive linking strategy, intermediate relations are reasoned on the document-level graphs by mention convolution. We further notice the generalization problem of NA instances, which is caused by incomplete annotation and worsened by fully-connected mention pairs. An improved ranking loss is proposed to attend this problem. Experiments show the connections between different mentions are crucial to document-level relation extraction, which enables the model to extract more meaningful higher-level compositional relations.
Entities, as the essential elements in relation extraction tasks, exhibit certain structure. In this work, we formulate such structure as distinctive dependencies between mention pairs. We then propose SSAN, which incorporates these structural dependencies within the standard self-attention mechanism and throughout the overall encoding stage. Specifically, we design two alternative transformation modules inside each self-attention building block to produce attentive biases so as to adaptively regularize its attention flow. Our experiments demonstrate the usefulness of the proposed entity structure and the effectiveness of SSAN. It significantly outperforms competitive baselines, achieving new state-of-the-art results on three popular document-level relation extraction datasets. We further provide ablation and visualization to show how the entity structure guides the model for better relation extraction. Our code is publicly available.
Text-to-SQL is a crucial task toward developing methods for understanding natural language by computers. Recent neural approaches deliver excellent performance; however, models that are difficult to interpret inhibit future developments. Hence, this study aims to provide a better approach toward the interpretation of neural models. We hypothesize that the internal behavior of models at hand becomes much easier to analyze if we identify the detailed performance of schema linking simultaneously as the additional information of the text-to-SQL performance. We provide the ground-truth annotation of schema linking information onto the Spider dataset. We demonstrate the usefulness of the annotated data and how to analyze the current state-of-the-art neural models.
We study the problem of query attribute value extraction, which aims to identify named entities from user queries as diverse surface form attribute values and afterward transform them into formally canonical forms. Such a problem consists of two phases: {named entity recognition (NER)} and {attribute value normalization (AVN)}. However, existing works only focus on the NER phase but neglect equally important AVN. To bridge this gap, this paper proposes a unified query attribute value extraction system in e-commerce search named QUEACO, which involves both two phases. Moreover, by leveraging large-scale weakly-labeled behavior data, we further improve the extraction performance with less supervision cost. Specifically, for the NER phase, QUEACO adopts a novel teacher-student network, where a teacher network that is trained on the strongly-labeled data generates pseudo-labels to refine the weakly-labeled data for training a student network. Meanwhile, the teacher network can be dynamically adapted by the feedback of the students performance on strongly-labeled data to maximally denoise the noisy supervisions from the weak labels. For the AVN phase, we also leverage the weakly-labeled query-to-attribute behavior data to normalize surface form attribute values from queries into canonical forms from products. Extensive experiments on a real-world large-scale E-commerce dataset demonstrate the effectiveness of QUEACO.