Domain generalization aims to learn an invariant model that can generalize well to the unseen target domain. In this paper, we propose to tackle the problem of domain generalization by delivering an effective framework named Variational Disentanglement Network (VDN), which is capable of disentangling the domain-specific features and task-specific features, where the task-specific features are expected to be better generalized to unseen but related test data. We further show the rationale of our proposed method by proving that our proposed framework is equivalent to minimize the evidence upper bound of the divergence between the distribution of task-specific features and its invariant ground truth derived from variational inference. We conduct extensive experiments to verify our method on three benchmarks, and both quantitative and qualitative results illustrate the effectiveness of our method.