Word Embeddings, Cosine Similarity and Deep Learning for Identification of Professions & Occupations in Health-related Social Media


Abstract in English

ProfNER-ST focuses on the recognition of professions and occupations in Spanish-language Twitter data. Our system combines word-level embeddings, including pre-trained Spanish BERT, with cosine similarity computed over a subset of entities; these features serve as input to an encoder-decoder architecture with an attention mechanism. Our best run achieved an F1 score of 0.823 on the official test set.
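
To illustrate the cosine-similarity component, the following is a minimal sketch rather than the system's actual implementation. The abstract does not say how the similarities are computed or how they are fed to the network, so the sketch assumes one similarity score per reference entity, which could then be appended to a token's embedding. The function names, the example entities and the three-dimensional toy vectors are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Return cos(u, v) = (u . v) / (|u| |v|); 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_features(
    token_vec: Sequence[float],
    entity_vecs: Dict[str, Sequence[float]],
) -> List[float]:
    """One cosine score per reference entity, in sorted key order (assumed layout)."""
    return [cosine_similarity(token_vec, entity_vecs[name]) for name in sorted(entity_vecs)]


# Toy 3-dimensional vectors standing in for real word embeddings (hypothetical values).
entity_vecs = {
    "enfermera": [1.0, 0.0, 0.0],
    "médico": [0.0, 1.0, 0.0],
}
token_vec = [1.0, 1.0, 0.0]

features = similarity_features(token_vec, entity_vecs)
print(features)
```

In a setup like this, the resulting list would be concatenated with the token's embedding before it reaches the encoder; whether the original system does this is an assumption.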

References used

https://aclanthology.org/