Semi-Automated Labeling of Requirement Datasets for Relation Extraction

Published by Jannik Fischbach in 2021 in Informatics Engineering. The research's language is English.

Abstract in English

Creating datasets manually by human annotators is a laborious task that can lead to biased and inhomogeneous labels. We propose a flexible, semi-automatic framework for labeling data for relation extraction. Furthermore, we provide a dataset of preprocessed sentences from the requirements engineering domain, including a set of automatically created as well as hand-crafted labels. In our case study, we compare the human and automatic labels and show that there is a substantial overlap between both annotations.
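
The comparison of human and automatic labels mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which overlap measure the case study uses, so the set-based precision, recall, F1, and Jaccard scores below are an assumption. The Relation type, the overlap function, and the example triples are likewise hypothetical and are not taken from the published dataset.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """Hypothetical relation label: one (head, label, tail) triple per sentence."""
    sentence_id: int
    head: str
    label: str
    tail: str


def overlap(human: set, automatic: set) -> dict:
    """Compare two label sets, treating the human labels as the reference."""
    shared = human & automatic
    precision = len(shared) / len(automatic) if automatic else 0.0
    recall = len(shared) / len(human) if human else 0.0
    f1 = 2 * precision * recall / (precision + recall) if shared else 0.0
    union = human | automatic
    jaccard = len(shared) / len(union) if union else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "jaccard": jaccard}


# Invented example labels for two requirement sentences (illustration only).
human_labels = {
    Relation(1, "system", "shall_send", "alert"),
    Relation(1, "alert", "to", "operator"),
    Relation(2, "user", "enters", "password"),
}
automatic_labels = {
    Relation(1, "system", "shall_send", "alert"),
    Relation(2, "user", "enters", "password"),
    Relation(2, "password", "of", "user"),
}

scores = overlap(human_labels, automatic_labels)
print(scores)
```

In this invented example, two of the three automatic labels match a human label, so precision and recall are both 2/3 and the Jaccard overlap is 0.5. Exact-match comparison is the strictest choice; a real evaluation might instead allow partial span matches or use a chance-corrected agreement measure.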
