
Towards an Open Platform for Legal Information

Added by Malte Ostendorff
Publication date: 2020
Language: English





Recent advances in the area of legal information systems have led to a variety of applications that promise support in processing and accessing legal documents. Unfortunately, these applications have various limitations, e.g., regarding scope or extensibility. Furthermore, we do not observe a trend towards open access in digital libraries in the legal domain as we observe in other domains, e.g., economics or computer science. To improve open access in the legal domain, we present our approach for an open source platform to transparently process and access Legal Open Data. This enables the sustainable development of legal applications by offering a single technology stack. Moreover, the approach facilitates the development and deployment of new technologies. As proof of concept, we implemented six technologies and generated metadata for more than 250,000 German laws and court decisions. Thus, we can provide users of our platform not only with access to legal documents, but also with the information contained in them.
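To illustrate how an application might consume documents and metadata from such a platform, the following Python sketch queries a hypothetical REST endpoint for court decisions. The base URL, query parameters, and response fields are assumptions chosen for illustration, not the platform's documented API.

```python
# Hypothetical sketch: querying an open legal-data platform over a REST API.
# Endpoint, parameters, and response fields are illustrative placeholders.
import requests

BASE_URL = "https://example-legal-platform.org/api"  # placeholder URL

def search_cases(query, court=None, limit=10):
    """Search court decisions by full-text query and an optional court filter."""
    params = {"q": query, "limit": limit}
    if court:
        params["court"] = court
    resp = requests.get(f"{BASE_URL}/cases/search/", params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    results = search_cases("Mietrecht", court="BGH")
    for case in results.get("results", []):
        # "title" and "date" are assumed response fields.
        print(case.get("title"), case.get("date"))
```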



Related Research

The lack of scientific openness is identified as one of the key challenges of computational reproducibility. In addition to Open Data, Free and Open-source Software (FOSS) and Open Hardware (OH) can address this challenge by introducing open policies, standards, and recommendations. However, while both FOSS and OH are free to use, study, modify, and redistribute, there are significant differences in sharing and reusing these artifacts. FOSS is increasingly supported with software repositories, but support for OH is lacking, potentially due to the complexity of its digital formats and licensing. This paper proposes leveraging FAIR principles to make OH findable, accessible, interoperable, and reusable. We define what FAIR means for OH, describe how it differs from FOSS, and present examples of its unique demands. We also evaluate dissemination platforms currently used for OH and provide recommendations.
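As a rough illustration of what FAIR-style metadata for an Open Hardware project could look like, the sketch below assembles a minimal machine-readable record in Python. The field names and values are assumptions for illustration only, not a standardized OH metadata schema.

```python
# Illustrative sketch: a minimal metadata record that could help make an Open
# Hardware project findable, accessible, interoperable, and reusable.
# Field names are assumptions, not a standardized schema.
import json

oh_record = {
    "identifier": "doi:10.xxxx/example-oh-project",  # persistent identifier (Findable)
    "title": "Example open-source sensor board",
    "license": "CERN-OHL-S-2.0",                     # explicit licensing (Reusable)
    "design_files": [
        {"path": "pcb/board.kicad_pcb", "format": "KiCad"},     # open formats (Interoperable)
        {"path": "mech/enclosure.FCStd", "format": "FreeCAD"},
    ],
    "documentation": "https://example.org/oh-project/docs",     # public docs (Accessible)
    "version": "1.2.0",
}

print(json.dumps(oh_record, indent=2))
```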
Open Information Extraction (OIE) is the task of extracting facts from sentences in the form of relations and their corresponding arguments in a schema-free manner. Intrinsic performance of OIE systems is difficult to measure due to the incompleteness of existing OIE benchmarks: the ground truth extractions do not group all acceptable surface realizations of the same fact that can be extracted from a sentence. To measure performance of OIE systems more realistically, it is necessary to manually annotate complete facts (i.e., clusters of all acceptable surface realizations of the same fact) from input sentences. We propose AnnIE: an interactive annotation platform that facilitates such challenging annotation tasks and supports creation of complete fact-oriented OIE evaluation benchmarks. AnnIE is modular and flexible in order to support different use case scenarios (i.e., benchmarks covering different types of facts). We use AnnIE to build two complete OIE benchmarks: one with verb-mediated facts and another with facts encompassing named entities. Finally, we evaluate several OIE systems on our complete benchmarks created with AnnIE. Our results suggest that existing incomplete benchmarks are overly lenient, and that OIE systems are not as robust as previously reported. We publicly release AnnIE under a non-restrictive license.
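The Python sketch below illustrates the underlying evaluation idea: each gold fact is a cluster of acceptable (subject, relation, object) realizations, and a system triple counts as correct if it matches any member of any cluster. The toy sentence, the clusters, and the exact-match rule are simplified assumptions, not AnnIE's actual matching logic.

```python
# Sketch of fact-level OIE evaluation over complete fact clusters.
# Data and the exact-match rule are simplified assumptions for illustration.

sentence = "Marie Curie, born in Warsaw, won the Nobel Prize in 1903."

gold_fact_clusters = [
    {  # one fact, several acceptable surface realizations
        ("Marie Curie", "won", "the Nobel Prize in 1903"),
        ("Marie Curie", "won the Nobel Prize in", "1903"),
    },
    {
        ("Marie Curie", "born in", "Warsaw"),
        ("Marie Curie", "was born in", "Warsaw"),
    },
]

system_triples = [
    ("Marie Curie", "won", "the Nobel Prize in 1903"),
    ("Marie Curie", "born", "Warsaw"),  # not an accepted realization above
]

def score(system, clusters):
    """Precision over system triples, recall over gold fact clusters."""
    matched_clusters = set()
    correct = 0
    for triple in system:
        for i, cluster in enumerate(clusters):
            if triple in cluster:
                correct += 1
                matched_clusters.add(i)
                break
    precision = correct / len(system) if system else 0.0
    recall = len(matched_clusters) / len(clusters) if clusters else 0.0
    return precision, recall

print(score(system_triples, gold_fact_clusters))  # (0.5, 0.5)
```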
Automatically extracting key information from scientific documents has the potential to help scientists work more efficiently and accelerate the pace of scientific progress. Prior work has considered extracting document-level entity clusters and relations end-to-end from raw scientific text, which can improve literature search and help identify methods and materials for a given problem. Despite the importance of this task, most existing works on scientific information extraction (SciIE) consider extraction solely based on the content of an individual paper, without considering the paper's place in the broader literature. In contrast to prior work, we augment our text representations by leveraging a complementary source of document context: the citation graph of referential links between citing and cited papers. On a test set of English-language scientific documents, we show that simple ways of utilizing the structure and content of the citation graph can each lead to significant gains in different scientific information extraction tasks. When these tasks are combined, we observe a sizable improvement in end-to-end information extraction over the state-of-the-art, suggesting the potential for future work along this direction. We release software tools to facilitate citation-aware SciIE development.
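One simple way to use citation-graph context, sketched below, is to append the abstracts of cited and citing papers to the target paper's text before running an extractor. This is only an illustration of the general idea with toy data; it is not the model architecture used in the paper.

```python
# Minimal sketch: augmenting a paper's text with citation-graph neighbors
# before information extraction. Toy data; not the paper's actual method.
import networkx as nx

# Toy corpus: paper id -> abstract text (hypothetical).
abstracts = {
    "P1": "We propose BERT-CRF for chemical named entity recognition.",
    "P2": "A corpus of annotated chemistry papers for NER evaluation.",
    "P3": "Graph neural networks for citation recommendation.",
}

# Directed citation graph: edge (a, b) means paper a cites paper b.
citations = nx.DiGraph([("P1", "P2"), ("P3", "P1")])

def contextualize(paper_id, max_neighbors=5):
    """Return the paper's abstract augmented with neighbor abstracts."""
    neighbors = list(citations.successors(paper_id)) + list(citations.predecessors(paper_id))
    context = [abstracts[n] for n in neighbors[:max_neighbors]]
    return abstracts[paper_id] + " [CTX] " + " ".join(context)

print(contextualize("P1"))
```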
Over the past few decades, the amount of scientific articles and technical literature has grown exponentially. Consequently, there is a great need for systems that can ingest these documents at scale and make their content discoverable. Unfortunately, both the format of these documents (e.g., the PDF format or bitmap images) and the presentation of the data (e.g., complex tables) make the extraction of qualitative and quantitative data extremely challenging. We present a platform, powered by machine learning techniques, that ingests documents at scale and allows the user to train custom models on document collections. We show precision/recall results greater than 97% with regard to conversion to structured formats, as well as scaling evidence for each of the microservices constituting the platform.
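The sketch below shows, in schematic form, how such a conversion pipeline can be organized as a chain of small stages, mirroring the microservice idea, with ML models slotted in for layout and table parsing. All stage names, the placeholder logic, and the output schema are assumptions for illustration, not the platform's actual design.

```python
# Conceptual sketch of a modular document-conversion pipeline.
# Stage names and output schema are illustrative assumptions only.
from pathlib import Path

def parse_pdf(path):
    # Placeholder: a real stage would run PDF parsing or OCR here.
    return {"source": str(path), "pages": ["raw page text ..."]}

def detect_layout(doc):
    # Placeholder: a trained layout model would label titles, paragraphs, tables.
    doc["blocks"] = [{"type": "paragraph", "text": p} for p in doc["pages"]]
    return doc

def extract_tables(doc):
    # Placeholder: a table-structure model would emit rows and columns here.
    doc["tables"] = []
    return doc

def convert(path):
    """Run a document through each stage and return structured output."""
    doc = parse_pdf(Path(path))
    for stage in (detect_layout, extract_tables):
        doc = stage(doc)
    return doc

print(convert("example.pdf"))
```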
The COVID-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on COVID-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development of text mining and information retrieval systems over its rich collection of metadata and structured full text papers. Since its release, CORD-19 has been downloaded over 200K times and has served as the basis of many COVID-19 text mining and discovery systems. In this article, we describe the mechanics of dataset construction, highlighting challenges and key design decisions, provide an overview of how CORD-19 has been used, and describe several shared tasks built around the dataset. We hope this resource will continue to bring together the computing community, biomedical experts, and policy makers in the search for effective treatments and management policies for COVID-19.
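As a usage example, the snippet below loads the CORD-19 metadata file with pandas and filters papers by a keyword in the title. Column names such as "title", "abstract", and "publish_time" reflect the released metadata.csv, but should be checked against the specific dataset version you download.

```python
# Sketch: loading CORD-19 metadata and filtering by a title keyword.
# Verify column names against the downloaded dataset version.
import pandas as pd

meta = pd.read_csv("metadata.csv", low_memory=False)

# Keep papers whose title mentions "remdesivir" (case-insensitive).
mask = meta["title"].fillna("").str.contains("remdesivir", case=False)
subset = meta.loc[mask, ["title", "abstract", "publish_time"]]

print(len(subset), "matching papers")
print(subset.head())
```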