Curriculum CycleGAN for Textual Sentiment Domain Adaptation with Multiple Sources


Abstract in English

Sentiment analysis of user-generated reviews and comments on products and services in social networks can help enterprises analyze customer feedback and take corresponding actions for improvement. To reduce the need for large-scale annotation in the target domain, domain adaptation (DA) provides an alternative solution by learning a transferable model from other labeled source domains. Existing multi-source domain adaptation (MDA) methods fail to extract some sentiment-related discriminative features in the target domain, neglect the correlations among different sources and the distribution differences among sub-domains within the same source, or cannot reflect how the optimal weighting varies across training stages. In this paper, we propose a novel instance-level MDA framework, named curriculum cycle-consistent generative adversarial network (C-CycleGAN), to address these issues. Specifically, C-CycleGAN consists of three components: (1) a pre-trained text encoder that encodes textual input from different domains into a continuous representation space, (2) an intermediate domain generator with curriculum instance-level adaptation that bridges the gap between the source and target domains, and (3) a task classifier trained on the intermediate domain for final sentiment classification. C-CycleGAN transfers source samples at the instance level to an intermediate domain that is closer to the target domain, preserving sentiment semantics without losing discriminative features. Further, our dynamic instance-level weighting mechanisms can assign the optimal weights to different source samples in each training stage. We conduct extensive experiments on three benchmark datasets and achieve substantial gains over state-of-the-art DA approaches. Our source code is released at: https://github.com/WArushrush/Curriculum-CycleGAN.
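
To make the idea of stage-dependent instance weighting concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how the weights are computed, so everything here is an assumption: each source sample is given a hypothetical "target-likeness" score (for example, a domain discriminator output), the scores are turned into weights with a softmax, and the softmax temperature is annealed linearly over training stages so that later stages concentrate on the most target-like samples. The function names and default values are illustrative only.

```python
import math


def instance_weights(target_likeness, temperature):
    """Softmax over per-sample scores (assumed weighting rule, not from the paper).

    A lower temperature concentrates weight on the source samples whose
    scores suggest they are closest to the target domain.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in target_likeness]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def stage_temperature(stage, num_stages, start=5.0, end=0.5):
    """Linearly anneal the temperature from `start` to `end` (assumed schedule)."""
    if num_stages < 2:
        return end
    frac = stage / (num_stages - 1)
    return start + frac * (end - start)


def weighted_loss(per_sample_losses, weights):
    """Weighted sum of per-sample classification losses."""
    return sum(w * l for w, l in zip(weights, per_sample_losses))


if __name__ == "__main__":
    scores = [0.9, 0.5, 0.1]  # hypothetical target-likeness of three source samples
    losses = [0.2, 0.7, 1.5]  # hypothetical per-sample losses
    for stage in range(3):
        t = stage_temperature(stage, 3)
        w = instance_weights(scores, t)
        print(stage, [round(x, 3) for x in w], round(weighted_loss(losses, w), 3))
```

In this sketch the weights are nearly uniform in the first stage and become more peaked as the temperature drops, which is one simple way to realize weights that change across training stages.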
