
Head-driven Phrase Structure Parsing in $O(n^3)$ Time Complexity

Added by Zuchao Li
Publication date: 2021
Research language: English





Constituent and dependency parsing, the two classic forms of syntactic parsing, have been found to benefit from joint training and decoding under a uniform formalism, Head-driven Phrase Structure Grammar (HPSG). However, decoding this unified grammar has a higher time complexity ($O(n^5)$) than decoding either form individually ($O(n^3)$), since more factors have to be considered during decoding. We thus propose an improved head scorer that helps achieve a novel performance-preserving parser with $O(n^3)$ time complexity. Furthermore, on the basis of this practical HPSG parser, we investigate the strengths of HPSG-based parsing and explore a general method for training an HPSG-based parser from only constituent or dependency annotations in a multilingual scenario. We thus present a more effective, more in-depth, and more general study of HPSG parsing.
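
The complexity gap can be made concrete with the usual chart-parsing account: a CKY-style decoder over precomputed span scores fills an $O(n^2)$ chart with an $O(n)$ split loop, i.e. $O(n^3)$ in total, whereas a lexicalized (head-indexed) chart must also track a head position per item and choose the sibling's head at each combination step, adding two further $O(n)$ factors and yielding $O(n^5)$. The sketch below uses hypothetical names and is not the paper's decoder; it shows only the plain $O(n^3)$ chart loop, assuming all span scores are already computed.

```python
# Minimal CKY-style span decoder (illustrative sketch, not the paper's algorithm).
# The three nested loops below are the O(n^3) part; enumerating head positions
# for the two child spans inside this loop is what pushes joint decoding to O(n^5).
import numpy as np

def cky_decode(span_score):
    """span_score[i][j]: score of treating words i..j-1 as one constituent
    (assumed precomputed for all spans; (n+1) x (n+1) array)."""
    n = len(span_score) - 1
    best = np.full((n + 1, n + 1), -np.inf)
    split = np.zeros((n + 1, n + 1), dtype=int)
    for i in range(n):                           # length-1 spans
        best[i][i + 1] = span_score[i][i + 1]
    for length in range(2, n + 1):               # O(n) span lengths
        for i in range(n - length + 1):          # O(n) start positions
            j = i + length
            for k in range(i + 1, j):            # O(n) split points
                s = best[i][k] + best[k][j] + span_score[i][j]
                if s > best[i][j]:
                    best[i][j], split[i][j] = s, k
    return best[0][n], split                     # best score + backtrace table
```

Recovering the tree from `split` is a standard backtrace over the chosen split points.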



Related Research

In this paper, we propose Neural Phrase-to-Phrase Machine Translation (NP$^2$MT). Our model uses a phrase attention mechanism to discover relevant input (source) segments that are used by a decoder to generate output (target) phrases. We also design an efficient dynamic programming algorithm to decode segments, which allows the model to be trained faster than the existing neural phrase-based machine translation method by Huang et al. (2018). Furthermore, our method can naturally integrate with external phrase dictionaries during decoding. Empirical experiments show that our method achieves performance comparable to state-of-the-art methods on benchmark datasets. However, when the training and testing data are from different distributions or domains, our method performs better.
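
As an illustration of the kind of segmental dynamic program the abstract alludes to, the sketch below (hypothetical names; not the authors' decoder) finds the best segmentation of a token sequence into phrases under an arbitrary phrase scoring function. With segments capped at `max_len` tokens, the table fill is linear in the sentence length.

```python
def best_segmentation(tokens, phrase_score, max_len=6):
    """Hypothetical segmental DP: split `tokens` into phrases maximizing the
    sum of phrase_score(phrase) over segments."""
    n = len(tokens)
    best = [float("-inf")] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for j in range(1, n + 1):                        # end of a segment
        for i in range(max(0, j - max_len), j):      # candidate segment starts
            s = best[i] + phrase_score(tuple(tokens[i:j]))
            if s > best[j]:
                best[j], back[j] = s, i
    segs, j = [], n                                  # backtrace the boundaries
    while j > 0:
        segs.append(tokens[back[j]:j])
        j = back[j]
    return segs[::-1]
```
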
The paper presents some aspects involved in the formalization and implementation of HPSG theories. As a basis, the logical setups of Carpenter (1992) and King (1989, 1994) are briefly compared regarding their usefulness as a foundation for HPSG II (Pollard and Sag 1994). The possibilities for expressing HPSG theories in the HPSG II architecture and in various computational systems (ALE, Troll, CUF, and TFS) are discussed. Besides a formal characterization of these possibilities, the paper investigates the specific choices for constraints with certain linguistic motivations, i.e., the lexicon, structure licensing, and grammatical principles. An ALE implementation of a theory for German proposed by Hinrichs and Nakazawa (1994) is used as an example, and the ALE grammar is included in the appendix.
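
The systems compared here (ALE, Troll, CUF, TFS) all build on typed feature structures, whose central operation is unification. Purely as an illustration of that operation, and ignoring the type hierarchy and structure sharing (reentrancies) that these systems provide, a minimal sketch over nested dictionaries might look as follows (names are hypothetical):

```python
def unify(fs1, fs2):
    """Minimal sketch of feature-structure unification on nested dicts.
    Atomic values must match exactly; sub-structures are unified recursively.
    Real HPSG systems (e.g. ALE) additionally resolve types in a hierarchy
    and support structure sharing, both omitted here."""
    if not isinstance(fs1, dict) or not isinstance(fs2, dict):
        return fs1 if fs1 == fs2 else None      # atoms: must be identical
    result = dict(fs1)
    for feat, val in fs2.items():
        if feat in result:
            sub = unify(result[feat], val)
            if sub is None:                     # feature clash -> failure
                return None
            result[feat] = sub
        else:
            result[feat] = val
    return result

# e.g. unify({"HEAD": {"CASE": "nom"}}, {"HEAD": {"AGR": "3sg"}})
#   -> {"HEAD": {"CASE": "nom", "AGR": "3sg"}}
```
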
While vector-based language representations from pretrained language models have set a new standard for many NLP tasks, there is not yet a complete accounting of their inner workings. In particular, it is not entirely clear what aspects of sentence-level syntax are captured by these representations, nor how (if at all) they are built along the stacked layers of the network. In this paper, we aim to address such questions with a general class of interventional, input perturbation-based analyses of representations from pretrained language models. Importing from computational and cognitive neuroscience the notion of representational invariance, we perform a series of probes designed to test the sensitivity of these representations to several kinds of structure in sentences. Each probe involves swapping words in a sentence and comparing the representations from perturbed sentences against the original. We experiment with three different perturbations: (1) random permutations of n-grams of varying width, to test the scale at which a representation is sensitive to word position; (2) swapping of two spans which do or do not form a syntactic phrase, to test sensitivity to global phrase structure; and (3) swapping of two adjacent words which do or do not break apart a syntactic phrase, to test sensitivity to local phrase structure. Results from these probes collectively suggest that Transformers build sensitivity to larger parts of the sentence along their layers, and that hierarchical phrase structure plays a role in this process. More broadly, our results also indicate that structured input perturbations widen the scope of analyses that can be performed on often-opaque deep learning systems, and can serve as a complement to existing tools (such as supervised linear probes) for interpreting complex black-box models.
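
The three perturbations are straightforward to state as token-level operations. The sketch below (hypothetical helper names, not the authors' code) shows one way to implement them; in the probing setup described above, the original and perturbed sentences would each be encoded by the pretrained model and their layer-wise representations compared, e.g. with a cosine-similarity-based invariance score.

```python
import random

def shuffle_ngrams(tokens, n):
    """Perturbation (1): split the sentence into n-grams and permute them."""
    chunks = [tokens[i:i + n] for i in range(0, len(tokens), n)]
    random.shuffle(chunks)
    return [tok for chunk in chunks for tok in chunk]

def swap_spans(tokens, span_a, span_b):
    """Perturbations (2)/(3): swap two non-overlapping spans given as
    (start, end) index pairs; the caller chooses spans that do or do not
    respect syntactic phrase boundaries."""
    (a0, a1), (b0, b1) = sorted([span_a, span_b])
    assert a1 <= b0, "spans must not overlap"
    return (tokens[:a0] + tokens[b0:b1] + tokens[a1:b0]
            + tokens[a0:a1] + tokens[b1:])
```
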
In population protocols, the underlying distributed network consists of $n$ nodes (or agents), denoted by $V$, and a scheduler that continuously selects uniformly random pairs of nodes to interact. When two nodes interact, their states are updated by applying a state transition function that depends only on the states of the two nodes prior to the interaction. The efficiency of a population protocol is measured in terms of both time (which is the number of interactions until the nodes collectively have a valid output) and the number of possible states of nodes used by the protocol. By convention, we consider the parallel time cost, which is the time divided by $n$. In this paper we consider the majority problem, where each node receives as input a color that is either black or white, and the goal is to have all of the nodes output the color that is the majority of the input colors. We design a population protocol that solves the majority problem in $O(\log^{3/2} n)$ parallel time, both with high probability and in expectation, while using $O(\log n)$ states. Our protocol improves on a recent protocol of Berenbrink et al. that runs in $O(\log^{5/3} n)$ parallel time, both with high probability and in expectation, using $O(\log n)$ states.
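
For readers unfamiliar with the model, the classic four-state exact-majority protocol makes the setup concrete: strong opinions cancel in pairs, and surviving strong agents recruit weak ones, so the initial majority wins whenever the input is not tied. The simulation below only illustrates the interaction model and the parallel-time convention; it is not the $O(\log n)$-state, $O(\log^{3/2} n)$-time protocol the abstract describes.

```python
import random

# Classic four-state exact-majority protocol (illustrative only).
# 'A'/'B' are strong opinions, 'a'/'b' are weak ones.
RULES = {('A', 'B'): ('a', 'b'),   # strong opposites cancel into weak states
         ('A', 'b'): ('A', 'a'),   # a strong opinion recruits a weak opposite
         ('B', 'a'): ('B', 'b')}

def interact(u, v):
    if (u, v) in RULES:
        return RULES[(u, v)]
    if (v, u) in RULES:
        new_v, new_u = RULES[(v, u)]
        return new_u, new_v
    return u, v                    # every other pair is left unchanged

def parallel_time(inputs, seed=0):
    """Run until all agents agree; inputs must not be tied, or the protocol
    (and this loop) never converges. Returns interactions / n."""
    rng, state = random.Random(seed), list(inputs)
    n, t = len(state), 0
    while not (set(state) <= {'A', 'a'} or set(state) <= {'B', 'b'}):
        i, j = rng.sample(range(n), 2)   # scheduler picks a uniform random pair
        state[i], state[j] = interact(state[i], state[j])
        t += 1
    return t / n

# print(parallel_time(['A'] * 60 + ['B'] * 40))  # converges to the 'A' majority
```
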
Discourse representation tree structure (DRTS) parsing is a novel semantic parsing task that has recently attracted increasing attention. State-of-the-art performance can be achieved by a neural sequence-to-sequence model that treats tree construction as an incremental sequence generation problem. However, structural information such as the input syntax and the intermediate skeleton of the partial output is ignored by this model, even though it could be useful for DRTS parsing. In this work, we propose a structure-aware model that integrates such structural information at both the encoder and decoder phases, where a graph attention network (GAT) is exploited for effective modeling. Experimental results on a benchmark dataset show that our proposed model is effective and obtains the best performance in the literature.
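
As a pointer to the mechanism named in the abstract, the sketch below implements a single-head graph attention (GAT) layer in the style of Velickovic et al. (2018); it is a generic illustration, not the structure-aware encoder/decoder module proposed in the paper, and all parameter names are hypothetical.

```python
import numpy as np

def gat_layer(H, A, W, a_src, a_dst, slope=0.2):
    """One single-head graph attention layer.
    H: (N, F) node features, A: (N, N) adjacency with self-loops,
    W: (F, F') projection, a_src/a_dst: (F',) attention parameter halves."""
    Z = H @ W                                          # project node features
    e = (Z @ a_src)[:, None] + (Z @ a_dst)[None, :]    # pairwise logits e_ij
    e = np.where(e > 0, e, slope * e)                  # LeakyReLU
    e = np.where(A > 0, e, -1e9)                       # mask non-edges
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)   # softmax over neighbours
    return np.maximum(alpha @ Z, 0.0)                  # aggregate + nonlinearity
```
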