New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Everything Has a Cause: Leveraging Causal Inference in Legal Text Analysis

كل شيء له سبب: الاستفادة من الاستدلال السببية في تحليل النص القانوني

317 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Causal inference is the process of capturing cause-effect relationship among variables. Most existing works focus on dealing with structured data, while mining causal relationship among factors from unstructured data, like text, has been less examined, but is of great importance, especially in the legal domain. In this paper, we propose a novel Graph-based Causal Inference (GCI) framework, which builds causal graphs from fact descriptions without much human involvement and enables causal inference to facilitate legal practitioners to make proper decisions. We evaluate the framework on a challenging similar charge disambiguation task. Experimental results show that GCI can capture the nuance from fact descriptions among multiple confusing charges and provide explainable discrimination, especially in few-shot settings. We also observe that the causal knowledge contained in GCI can be effectively injected into powerful neural networks for better performance and interpretability.

References used

https://aclanthology.org/

rate research

JuriBERT: A Masked-Language Model Adaptation for French Legal Text

337 - Association for Computation Linguistics 2021 مقالة

Language models have proven to be very useful when adapted to specific domains. Nonetheless, little research has been done on the adaptation of domain-specific BERT models in the French language. In this paper, we focus on creating a language model a dapted to French legal text with the goal of helping law professionals. We conclude that some specific tasks do not benefit from generic language models pre-trained on large amounts of data. We explore the use of smaller architectures in domain-specific sub-languages and their benefits for French legal text. We prove that domain-specific pre-trained models can perform better than their equivalent generalised ones in the legal domain. Finally, we release JuriBERT, a new set of BERT models adapted to the French legal domain.

french legal text french legal french النص القانوني الفرنسي اللغة الفرنسية القانونية الفرنسية صناعة حمض الفوسفور المزيد..

Text Style Transfer: Leveraging a Style Classifier on Entangled Latent Representations

590 - Association for Computation Linguistics 2021 مقالة

Learning a good latent representation is essential for text style transfer, which generates a new sentence by changing the attributes of a given sentence while preserving its content. Most previous works adopt disentangled latent representation learn ing to realize style transfer. We propose a novel text style transfer algorithm with entangled latent representation, and introduce a style classifier that can regulate the latent structure and transfer style. Moreover, our algorithm for style transfer applies to both single-attribute and multi-attribute transfer. Extensive experimental results show that our method generally outperforms state-of-the-art approaches.

text style transfer style transfer latent representation نقل نمط النص نقل النمط التمثيل الكامن صناعة حمض الفوسفور المزيد..

GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation

511 - Association for Computation Linguistics 2021 مقالة

Large-scale language models such as GPT-3 are excellent few-shot learners, allowing them to be controlled via natural text prompts. Recent studies report that prompt-based direct classification eliminates the need for fine-tuning but lacks data and i nference scalability. This paper proposes a novel data augmentation technique that leverages large-scale language models to generate realistic text samples from a mixture of real samples. We also propose utilizing soft-labels predicted by the language models, effectively distilling knowledge from the large-scale language models and creating textual perturbations simultaneously. We perform data augmentation experiments on diverse classification tasks and show that our method hugely outperforms existing text augmentation methods. We also conduct experiments on our newly proposed benchmark to show that the augmentation effect is not only attributed to memorization. Further ablation studies and a qualitative analysis provide more insights into our approach.

leveraging large-scale language الاستفادة من اللغة واسعة النطاق صناعة حمض الفوسفور

Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction

339 - Association for Computation Linguistics 2021 مقالة

Zero-shot cross-lingual information extraction (IE) describes the construction of an IE model for some target language, given existing annotations exclusively in some other language, typically English. While the advance of pretrained multilingual enc oders suggests an easy optimism of train on English, run on any language'', we find through a thorough exploration and extension of techniques that a combination of approaches, both new and old, leads to better performance than any one cross-lingual strategy in particular. We explore techniques including data projection and self-training, and how different pretrained encoders impact them. We use English-to-Arabic IE as our initial example, demonstrating strong performance in this setting for event extraction, named entity recognition, part-of-speech tagging, and dependency parsing. We then apply data projection and self-training to three tasks across eight target languages. Because no single set of techniques performs the best across all tasks, we encourage practitioners to explore various configurations of the techniques described in this work when seeking to improve on zero-shot training.

cross-lingual information extraction zero-shot cross-lingual information استخراج المعلومات اللغوية معلومات صفرية صناعة حمض الفوسفور

Few-shot and Zero-shot Approaches to Legal Text Classification: A Case Study in the Financial Sector

440 - Association for Computation Linguistics 2021 مقالة

The application of predictive coding techniques to legal texts has the potential to greatly reduce the cost of legal review of documents, however, there is such a wide array of legal tasks and continuously evolving legislation that it is hard to cons truct sufficient training data to cover all cases. In this paper, we investigate few-shot and zero-shot approaches that require substantially less training data and introduce a triplet architecture, which for promissory statements produces performance close to that of a supervised system. This method allows predictive coding methods to be rapidly developed for new regulations and markets.

legal text classification financial sector تصنيف النص القانوني صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Everything Has a Cause: Leveraging Causal Inference in Legal Text Analysis

كل شيء له سبب: الاستفادة من الاستدلال السببية في تحليل النص القانوني

Ask ChatGPT about the research

Read More

suggested questions