
Good-Enough Example Extrapolation


Publication date: 2021
Language: English





This paper asks whether extrapolating the hidden-space distribution of text examples from one class onto another is a valid inductive bias for data augmentation. To operationalize this question, I propose a simple data augmentation protocol called "good-enough example extrapolation" (GE3). GE3 is lightweight and has no hyperparameters. Applied to three text classification datasets under various data imbalance scenarios, GE3 improves performance more than upsampling and other hidden-space data augmentation methods.
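As a rough illustration of the idea, here is a minimal NumPy sketch of hidden-space extrapolation in the spirit of the abstract: the per-example offsets of a well-populated source class from its centroid are re-applied around the centroid of an under-represented target class to produce synthetic hidden vectors. The function name, array shapes, and the random stand-in encodings are illustrative assumptions, not the paper's code.

```python
import numpy as np

def ge3_augment(source_hidden, target_hidden):
    """Create synthetic hidden vectors for an under-represented target class.

    source_hidden: (n_source, d) hidden vectors of a large source class
    target_hidden: (n_target, d) hidden vectors of a small target class
    Returns an (n_source, d) array of synthetic target-class vectors.
    """
    source_mean = source_hidden.mean(axis=0)
    target_mean = target_hidden.mean(axis=0)
    # Offset of each source example from its own class centroid ...
    offsets = source_hidden - source_mean
    # ... re-applied around the target-class centroid.
    return target_mean + offsets

# Usage with random stand-in "hidden states":
rng = np.random.default_rng(0)
src = rng.normal(size=(100, 768))   # e.g., sentence encodings of a majority class
tgt = rng.normal(size=(5, 768))     # encodings of a minority class
synthetic = ge3_augment(src, tgt)   # 100 extra hidden vectors for the minority class
```

Note that, consistent with the abstract, this procedure has no hyperparameters: it only transfers the spread of one class onto another.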

Related research

This paper identifies some common and specific pitfalls in the development of sign language technologies targeted at deaf communities, with a specific focus on signing avatars. It calls for urgently interrogating some of the ideologies behind those technologies, including issues of ethical and responsible development. The paper addresses four separate and interlinked issues: ideologies about deaf people and mediated communication, bias in data sets and learning, user feedback, and applications of the technologies. The paper ends with several take-away points for both technology developers and deaf NGOs. Technology developers should give more consideration to diversifying their teams and working in an interdisciplinary way, and be mindful of the biases that inevitably creep into data sets. There should also be a consideration of the technologies' end users. Sign language interpreters are not the end users, nor should they be seen as the benchmark for language use. Technology developers and deaf NGOs can engage in a dialogue about how to prioritize application domains and how to prioritize within application domains. Finally, deaf NGOs' policy statements will need to take a longer view and use avatars as an occasion to envision a significantly better system than what sign language interpreting services can currently provide.
Healthcare predictive analytics aids medical decision-making, diagnosis prediction, and drug review analysis. Prediction accuracy is therefore an important criterion, which also necessitates robust predictive language models. However, deep learning models have proven vulnerable to insignificantly perturbed input instances that humans are unlikely to misclassify. Recent efforts to generate adversaries using rule-based synonyms and BERT-MLMs have appeared in the general domain, but the ever-increasing biomedical literature poses unique challenges. We propose BBAEG (Biomedical BERT-based Adversarial Example Generation), a black-box attack algorithm for biomedical text classification that leverages the strengths of both domain-specific synonym replacement for biomedical named entities and BERT-MLM predictions, along with spelling variation and number replacement. Through automatic and human evaluation on two datasets, we demonstrate that BBAEG performs stronger attacks with better language fluency and semantic coherence than prior work.
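As a rough sketch of how such a black-box attack can be organized, the loop below tries candidates from several perturbation generators (e.g., synonym, MLM, spelling, and number edits) and keeps the first one that flips the victim classifier's prediction, querying only the model's output label. The function names and candidate generators are placeholders assumed for illustration; they are not BBAEG's actual implementation.

```python
from typing import Callable, Iterable, List, Optional

def black_box_attack(text: str,
                     true_label: int,
                     classify: Callable[[str], int],
                     candidate_fns: Iterable[Callable[[str], List[str]]]) -> Optional[str]:
    """Return an adversarial variant of `text` that flips the prediction, or None."""
    # Each generator proposes perturbed texts, e.g. biomedical synonym swaps,
    # masked-LM substitutions, spelling variants, or number replacements.
    for make_candidates in candidate_fns:
        for candidate in make_candidates(text):
            # Black-box setting: only the model's predicted label is queried.
            if classify(candidate) != true_label:
                return candidate
    return None
```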
Sub-tasks of intent classification, such as robustness to distribution shift, adaptation to specific user groups, personalization, and out-of-domain detection, require extensive and flexible datasets for experiments and evaluation. As collecting such datasets is time- and labor-consuming, we propose using text generation methods to gather them. The generator should be trained to generate utterances that belong to a given intent. We explore two approaches to the generation of task-oriented utterances: in the zero-shot approach, the model is trained to generate utterances from seen intents and is then used to generate utterances for intents unseen during training; in the one-shot approach, the model is additionally presented with a single utterance from a test intent. We perform a thorough automatic and human evaluation of the intrinsic properties of the two generation approaches. The attributes of the generated data are close to those of the original test sets collected via crowd-sourcing.
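To make the conditioning on an intent label concrete, the sketch below prompts a Hugging Face text-generation pipeline with the intent name (zero-shot) or the intent name plus one example utterance (one-shot). The prompt format, the gpt2 checkpoint, and the decoding settings are assumptions made for illustration and are not the paper's setup.

```python
from transformers import pipeline

# Small causal LM as a stand-in generator; the paper's model may differ.
generator = pipeline("text-generation", model="gpt2")

def generate_utterances(intent: str, example: str = "", n: int = 5):
    # Zero-shot: condition only on the intent name; one-shot: prepend one example.
    prompt = f"intent: {intent}\n"
    if example:
        prompt += f"example: {example}\n"
    prompt += "utterance:"
    outputs = generator(prompt, max_new_tokens=20, do_sample=True,
                        num_return_sequences=n)
    # Strip the prompt to keep only the newly generated utterance text.
    return [o["generated_text"][len(prompt):].strip() for o in outputs]

# e.g. generate_utterances("book_flight", example="I need a ticket to Boston")
```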
This paper describes the approach that was developed for SemEval 2021 Task 7 (Hahackathon: Incorporating Demographic Factors into Shared Humor Tasks) by the DUTH Team. We used and compared a variety of preprocessing techniques, vectorization methods, and numerous conventional machine learning algorithms in order to construct classification and regression models for the given tasks. To improve our system's performance, we combined the models' outputs with small neural networks (NNs) via majority voting for the classification tasks and via their mean for the regression tasks. While these methods proved weaker than modern deep learning models, they remain relevant in research tasks because of their low computational requirements and faster training.
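To make the voting setup concrete, the sketch below combines TF-IDF features with a few conventional classifiers and a small neural network under hard (majority) voting, roughly in the spirit described above. The specific models, parameters, and toy data are assumptions for illustration, not the team's actual system.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.neural_network import MLPClassifier

# Toy data: 1 = humorous, 0 = not humorous.
texts = ["that joke was hilarious", "what a funny pun",
         "the weather report for today", "quarterly sales figures attached"]
labels = [1, 1, 0, 0]

ensemble = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(
        estimators=[
            ("logreg", LogisticRegression(max_iter=1000)),
            ("svm", LinearSVC()),
            ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)),
        ],
        voting="hard",   # majority vote over the individual predictions
    ),
)
ensemble.fit(texts, labels)
print(ensemble.predict(["this pun made me laugh"]))
```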
This paper presents a method for integrating a database with Jgroup based on Hibernate, an object-relational mapping (ORM) tool. We compare the performance of Jgroup integrated with Hibernate against the performance of RMI integrated with Hibernate. The results show that Jgroup/Hibernate outperforms RMI/Hibernate as the number of clients increases.
