Machine learning solutions are often criticized for failing to explain their successes and failures. Understanding which instances are misclassified, and why, is essential for improving the learning process. This work helps to fill this gap by proposing a methodology to characterize, quantify and measure the impact of hard instances in the task of polarity classification of movie reviews. We divide such instances into two categories: neutrality, where the text does not convey a clear polarity, and discrepancy, where the polarity of the text is the opposite of its true rating. We quantify the number of hard instances in polarity classification of movie reviews and provide empirical evidence that such problematic instances deserve attention, as they are much harder to classify for both machine and human classifiers. To the best of our knowledge, this is the first systematic analysis of the impact of hard instances in polarity detection from well-formed textual reviews.
References used
https://aclanthology.org/