While natural language understanding of long-form documents remains an open challenge, such documents often contain structural information that can inform the design of models that encode them. Movie scripts are an example of such richly structured text: scripts are segmented into scenes, which decompose into dialogue and descriptive components. In this work, we propose a neural architecture to encode this structure, which performs robustly on two multi-label tag classification tasks without using handcrafted features. We add a layer of insight by augmenting the encoder with an unsupervised "interpretability" module, which can be used to extract and visualize narrative trajectories. Though this work specifically tackles screenplays, we discuss how the underlying approach can be generalized to a range of structured documents.
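To make the structure described above concrete, the following is a minimal illustrative sketch, not the architecture proposed in this work. It only mirrors the hierarchy stated in the abstract (script, then scenes, then dialogue and descriptive components) and the multi-label output. Everything else is an assumption made for illustration: the `Scene` type, the toy `embed` function, mean pooling as the aggregation step, the embedding size `DIM`, and the linear tag scorer in `predict_tags`.

```python
import math
from dataclasses import dataclass
from typing import Dict, List

DIM = 8  # toy embedding size (assumption, chosen only for illustration)


@dataclass
class Scene:
    dialogue: List[str]     # utterances spoken in the scene
    description: List[str]  # descriptive / action lines


def embed(text: str) -> List[float]:
    """Toy deterministic bag-of-words embedding; a stand-in for a learned encoder."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[sum(ord(c) for c in token) % DIM] += 1.0
    return vec


def mean_pool(vectors: List[List[float]], dim: int) -> List[float]:
    """Average a list of vectors; an empty list yields a zero vector."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def encode_scene(scene: Scene) -> List[float]:
    """Encode the two components separately, then concatenate them."""
    dialogue_vec = mean_pool([embed(t) for t in scene.dialogue], DIM)
    description_vec = mean_pool([embed(t) for t in scene.description], DIM)
    return dialogue_vec + description_vec  # list concatenation, length 2 * DIM


def encode_script(scenes: List[Scene]) -> List[float]:
    """Aggregate scene vectors into a single script-level vector."""
    return mean_pool([encode_scene(s) for s in scenes], 2 * DIM)


def predict_tags(
    script_vec: List[float],
    weights: Dict[str, List[float]],
    biases: Dict[str, float],
    threshold: float = 0.5,
) -> List[str]:
    """Multi-label prediction: one independent sigmoid per tag."""
    tags = []
    for tag, w in weights.items():
        logit = sum(wi * xi for wi, xi in zip(w, script_vec)) + biases[tag]
        if 1.0 / (1.0 + math.exp(-logit)) >= threshold:
            tags.append(tag)
    return tags
```

Each tag is scored independently, so a script can receive several tags or none, which is what distinguishes a multi-label setup from single-label classification. In a learned model, the pooling steps would be replaced by trainable sequence encoders at each level of the hierarchy.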
References used
https://aclanthology.org/