ﻻ يوجد ملخص باللغة العربية
Online reviews are a vital source of information when purchasing a service or a product. Opinion spammers manipulate these reviews, deliberately altering the overall perception of the service. Though there exists a corpus of online reviews, only a few have been labeled as spam or non-spam, making it difficult to train spam detection models. We propose an adversarial training mechanism leveraging the capabilities of Generative Pre-Training 2 (GPT-2) for classifying opinion spam with limited labeled data and a large set of unlabeled data. Experiments on TripAdvisor and YelpZip datasets show that the proposed model outperforms state-of-the-art techniques by at least 7% in terms of accuracy when labeled data is limited. The proposed model can also generate synthetic spam/non-spam reviews with reasonable perplexity, thereby, providing additional labeled data during training.
Training generative adversarial networks (GAN) using too little data typically leads to discriminator overfitting, causing training to diverge. We propose an adaptive discriminator augmentation mechanism that significantly stabilizes training in limi
We apply the network Lasso to classify partially labeled data points which are characterized by high-dimensional feature vectors. In order to learn an accurate classifier from limited amounts of labeled data, we borrow statistical strength, via an in
We consider a novel data driven approach for designing learning algorithms that can effectively learn with only a small number of labeled examples. This is crucial for modern machine learning applications where labels are scarce or expensive to obtai
Neural methods have been shown to achieve high performance in Named Entity Recognition (NER), but rely on costly high-quality labeled data for training, which is not always available across languages. While previous works have shown that unlabeled da
Attribute reduction is one of the most important research topics in the theory of rough sets, and many rough sets-based attribute reduction methods have thus been presented. However, most of them are specifically designed for dealing with either labe