Research papers, master's and doctoral theses about job postings

De-identification of Privacy-related Entities in Job Postings

170 - Association for Computational Linguistics 2021, Article

De-identification is the task of detecting privacy-related entities in text, such as person names, emails and contact data. It has been well-studied within the medical domain. The need for de-identification technology is increasing, as privacy-preserving data handling is in high demand in many domains. In this paper, we focus on job postings. We present JobStack, a new corpus for de-identification of personal data in job vacancies on Stackoverflow. We introduce baselines, comparing Long Short-Term Memory (LSTM) and Transformer models. To improve these baselines, we experiment with BERT representations and with distantly related auxiliary data via multi-task learning. Our results show that auxiliary data helps to improve de-identification performance. While BERT representations improve performance, surprisingly, "vanilla" BERT turned out to be more effective than BERT trained on Stackoverflow-related data.
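To make the task concrete, the following is a minimal sketch of what de-identification produces for one entity type, email addresses. It is a rule-based illustration only and is not the method of the paper, which uses LSTM and Transformer models. The pattern `EMAIL_RE`, the functions `find_emails` and `mask`, the `EMAIL` label and the sample posting are all assumptions introduced for this example.

```python
import re

# Hypothetical pattern: a deliberately simple email matcher,
# not the learned model described in the abstract.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Return (start, end, label) spans for every email-like substring."""
    return [(m.start(), m.end(), "EMAIL") for m in EMAIL_RE.finditer(text)]


def mask(text, spans):
    """Replace each detected span with a [LABEL] placeholder."""
    out = []
    cursor = 0
    for start, end, label in sorted(spans):
        out.append(text[cursor:start])  # keep the text before the entity
        out.append(f"[{label}]")        # substitute the entity itself
        cursor = end
    out.append(text[cursor:])           # keep the remaining text
    return "".join(out)


# Invented sample sentence, not taken from the JobStack corpus.
posting = "Send your CV to jane.doe@example.com before Friday."
spans = find_emails(posting)
print(mask(posting, spans))
```

A fixed pattern like this can only cover rigidly formatted entities. Person names and other contact data vary too much for rules, which is one reason the task is usually treated as sequence labelling with learned models.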

privacy-related entities, detecting privacy-related entities, job postings
