Using powerful posterior distributions is a popular approach to improving variational inference. However, recent works have shown that the aggregated posterior may fail to match the unit Gaussian prior, so learning the prior becomes an alternative way to improve the lower bound. In this paper, for the first time in the literature, we prove the necessity and effectiveness of learning the prior when the aggregated posterior does not match the unit Gaussian prior, analyze why this mismatch may occur, and hypothesize that learning the prior may also improve the reconstruction loss; all of these claims are supported by our extensive experimental results. We show that, using a learned Real NVP prior and just one latent variable in a VAE, we can achieve test negative log-likelihood (NLL) comparable to that of very deep state-of-the-art hierarchical VAEs, outperforming many previous works that rely on complex hierarchical VAE architectures.
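To make the idea concrete, the sketch below (not the authors' implementation) shows one way a Real NVP flow can serve as a learnable prior p(z) in a VAE: the prior's log-density is evaluated by inverting a stack of affine coupling layers and applying the change-of-variables formula, and this log p(z) term replaces the fixed N(0, I) prior term in the ELBO. The class names (`AffineCoupling`, `RealNVPPrior`), the use of PyTorch, and all hyperparameters are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a learned Real NVP prior for a VAE latent variable.
# Assumes an even latent dimension; names and sizes are illustrative.
import math
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One affine coupling layer: transforms half of z conditioned on the other half."""
    def __init__(self, dim, hidden=256, flip=False):
        super().__init__()
        self.flip = flip
        half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * half),  # outputs scale s and shift t
        )

    def inverse(self, z):
        # Map z one step toward the standard Gaussian base distribution and
        # return the transformed variable plus the log|det Jacobian| of this map.
        z1, z2 = torch.chunk(z, 2, dim=-1)
        if self.flip:
            z1, z2 = z2, z1
        s, t = torch.chunk(self.net(z1), 2, dim=-1)
        s = torch.tanh(s)                      # bounded scales for stability
        y2 = (z2 - t) * torch.exp(-s)
        logdet = -s.sum(dim=-1)
        out = torch.cat([y2, z1], dim=-1) if self.flip else torch.cat([z1, y2], dim=-1)
        return out, logdet

class RealNVPPrior(nn.Module):
    """Learned prior p(z): a standard Gaussian pushed through coupling layers."""
    def __init__(self, dim, n_layers=4):
        super().__init__()
        self.dim = dim
        self.layers = nn.ModuleList(
            [AffineCoupling(dim, flip=(i % 2 == 1)) for i in range(n_layers)]
        )

    def log_prob(self, z):
        # log p(z) = log N(f(z); 0, I) + log|det df/dz| (change of variables),
        # where f inverts the coupling layers in reverse order.
        logdet = torch.zeros(z.shape[0], device=z.device)
        for layer in reversed(self.layers):
            z, ld = layer.inverse(z)
            logdet = logdet + ld
        base = -0.5 * (z ** 2).sum(dim=-1) - 0.5 * self.dim * math.log(2 * math.pi)
        return base + logdet

# In the ELBO, the fixed-prior term is then replaced by prior.log_prob(z), e.g.
#   elbo = log_px_given_z + prior.log_prob(z) - log_qz_given_x
# with the flow parameters trained jointly with the encoder and decoder.
```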