Do you want to publish a course? Click here

Reviewing contracts is a time-consuming procedure that incurs large expenses to companies and social inequality to those who cannot afford it. In this work, we propose document-level natural language inference (NLI) for contracts'', a novel, real-wor ld application of NLI that addresses such problems. In this task, a system is given a set of hypotheses (such as Some obligations of Agreement may survive termination.'') and a contract, and it is asked to classify whether each hypothesis is entailed by'', contradicting to'' or not mentioned by'' (neutral to) the contract as well as identifying evidence'' for the decision as spans in the contract. We annotated and release the largest corpus to date consisting of 607 annotated contracts. We then show that existing models fail badly on our task and introduce a strong baseline, which (a) models evidence identification as multi-label classification over spans instead of trying to predict start and end tokens, and (b) employs more sophisticated context segmentation for dealing with long documents. We also show that linguistic characteristics of contracts, such as negations by exceptions, are contributing to the difficulty of this task and that there is much room for improvement.
Human evaluation has always been expensive while researchers struggle to trust the automatic metrics. To address this, we propose to customise traditional metrics by taking advantages of the pre-trained language models (PLMs) and the limited availabl e human labelled scores. We first re-introduce the hLEPOR metric factors, followed by the Python version we developed (ported) which achieved the automatic tuning of the weighting parameters in hLEPOR metric. Then we present the customised hLEPOR (cushLEPOR) which uses Optuna hyper-parameter optimisation framework to fine-tune hLEPOR weighting parameters towards better agreement to pre-trained language models (using LaBSE) regarding the exact MT language pairs that cushLEPOR is deployed to. We also optimise cushLEPOR towards professional human evaluation data based on MQM and pSQM framework on English-German and Chinese-English language pairs. The experimental investigations show cushLEPOR boosts hLEPOR performances towards better agreements to PLMs like LABSE with much lower cost, and better agreements to human evaluations including MQM and pSQM scores, and yields much better performances than BLEU. Official results show that our submissions win three language pairs including English-German and Chinese-English on News domain via cushLEPOR(LM) and English-Russian on TED domain via hLEPOR. (data available at https://github.com/poethan/cushLEPOR)
Temporal language grounding (TLG) aims to localize a video segment in an untrimmed video based on a natural language description. To alleviate the expensive cost of manual annotations for temporal boundary labels,we are dedicated to the weakly superv ised setting, where only video-level descriptions are provided for training. Most of the existing weakly supervised methods generate a candidate segment set and learn cross-modal alignment through a MIL-based framework. However, the temporal structure of the video as well as the complicated semantics in the sentence are lost during the learning. In this work, we propose a novel candidate-free framework: Fine-grained Semantic Alignment Network (FSAN), for weakly supervised TLG. Instead of view the sentence and candidate moments as a whole, FSAN learns token-by-clip cross-modal semantic alignment by an iterative cross-modal interaction module, generates a fine-grained cross-modal semantic alignment map, and performs grounding directly on top of the map. Extensive experiments are conducted on two widely-used benchmarks: ActivityNet-Captions, and DiDeMo, where our FSAN achieves state-of-the-art performance.
Recent methods based on pre-trained language models have shown strong supervised performance on commonsense reasoning. However, they rely on expensive data annotation and time-consuming training. Thus, we focus on unsupervised commonsense reasoning. We show the effectiveness of using a common framework, Natural Language Inference (NLI), to solve diverse commonsense reasoning tasks. By leveraging transfer learning from large NLI datasets, and injecting crucial knowledge from commonsense sources such as ATOMIC 2020 and ConceptNet, our method achieved state-of-the-art unsupervised performance on two commonsense reasoning tasks: WinoWhy and CommonsenseQA. Further analysis demonstrated the benefits of multiple categories of knowledge, but problems about quantities and antonyms are still challenging.
In the Vision-and-Language Navigation (VLN) task an embodied agent navigates a 3D environment, following natural language instructions. A challenge in this task is how to handle off the path' scenarios where an agent veers from a reference path. Prio r work supervises the agent with actions based on the shortest path from the agent's location to the goal, but such goal-oriented supervision is often not in alignment with the instruction. Furthermore, the evaluation metrics employed by prior work do not measure how much of a language instruction the agent is able to follow. In this work, we propose a simple and effective language-aligned supervision scheme, and a new metric that measures the number of sub-instructions the agent has completed during navigation.
We explore the use of large pretrained language models as few-shot semantic parsers. The goal in semantic parsing is to generate a structured meaning representation given a natural language input. However, language models are trained to generate natu ral language. To bridge the gap, we use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation. Our results demonstrate that with only a small amount of data and very little code to convert into English-like representations, our blueprint for rapidly bootstrapping semantic parsers leads to surprisingly effective performance on multiple community tasks, greatly exceeding baseline methods also trained on the same limited data.
This paper presents a technique for the identification of participant slots in English language contracts. Taking inspiration from unsupervised slot extraction techniques, the system presented here uses a supervised approach to identify terms used to refer to a genre-specific slot in novel contracts. We evaluate the system in multiple feature configurations to demonstrate that the best performing system in both genres of contracts omits the exact mention form from consideration---even though such mention forms are often the name of the slot under consideration---and is instead based solely on the dependency label and parent; in other words, a more reliable quantification of a party's role in a contract is found in what they do rather than what they are named.
Given an untrimmed video and a natural language query, Natural Language Video Localization (NLVL) aims to identify the video moment described by query. To address this task, existing methods can be roughly grouped into two groups: 1) propose-and-rank models first define a set of hand-designed moment candidates and then find out the best-matching one. 2) proposal-free models directly predict two temporal boundaries of the referential moment from frames. Currently, almost all the propose-and-rank methods have inferior performance than proposal-free counterparts. In this paper, we argue that the performance of propose-and-rank models are underestimated due to the predefined manners: 1) Hand-designed rules are hard to guarantee the complete coverage of targeted segments. 2) Densely sampled candidate moments cause redundant computation and degrade the performance of ranking process. To this end, we propose a novel model termed LPNet (Learnable Proposal Network for NLVL) with a fixed set of learnable moment proposals. The position and length of these proposals are dynamically adjusted during training process. Moreover, a boundary-aware loss has been proposed to leverage frame-level information and further improve performance. Extensive ablations on two challenging NLVL benchmarks have demonstrated the effectiveness of LPNet over existing state-of-the-art methods.
Gender is widely discussed in the context of language tasks and when examining the stereotypes propagated by language models. However, current discussions primarily treat gender as binary, which can perpetuate harms such as the cyclical erasure of no n-binary gender identities. These harms are driven by model and dataset biases, which are consequences of the non-recognition and lack of understanding of non-binary genders in society. In this paper, we explain the complexity of gender and language around it, and survey non-binary persons to understand harms associated with the treatment of gender as binary in English language technologies. We also detail how current language representations (e.g., GloVe, BERT) capture and perpetuate these harms and related challenges that need to be acknowledged and addressed for representations to equitably encode gender information.
We introduce a new pretraining approach geared for multi-document language modeling, incorporating two key ideas into the masked language modeling self-supervised objective. First, instead of considering documents in isolation, we pretrain over sets of multiple related documents, encouraging the model to learn cross-document relationships. Second, we improve over recent long-range transformers by introducing dynamic global attention that has access to the entire input to predict masked tokens. We release CDLM (Cross-Document Language Model), a new general language model for multi-document setting that can be easily applied to downstream tasks. Our extensive analysis shows that both ideas are essential for the success of CDLM, and work in synergy to set new state-of-the-art results for several multi-text tasks.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا