In this paper, we present a novel approach, Momentum$^2$ Teacher, for student-teacher based self-supervised learning. The approach performs a momentum update on both network weights and batch normalization (BN) statistics: the teacher's weights are a momentum update of the student's, and the teacher's BN statistics are a momentum update of its own history. Momentum$^2$ Teacher is simple and efficient. It achieves state-of-the-art results (74.5\%) under the ImageNet linear evaluation protocol with a small batch size (e.g., 128), without requiring large-batch training on special hardware such as TPUs or inefficient cross-GPU operations (e.g., shuffling BN, synced BN). Our implementation and pre-trained models will be released on GitHub\footnote{https://github.com/zengarden/momentum2-teacher}.
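The two momentum updates described above can be sketched as exponential moving averages; the symbols below ($\theta_t$, $\theta_s$ for teacher and student weights, $\mu_t$, $\sigma_t^2$ for the teacher's running BN statistics, $\hat{\mu}$, $\hat{\sigma}^2$ for the current batch statistics, and the coefficients $m$, $m_{\mathrm{BN}}$) are illustrative names rather than notation taken from the paper:
\begin{align}
\theta_t &\leftarrow m\,\theta_t + (1 - m)\,\theta_s, \\
\mu_t &\leftarrow m_{\mathrm{BN}}\,\mu_t + (1 - m_{\mathrm{BN}})\,\hat{\mu}, \\
\sigma_t^2 &\leftarrow m_{\mathrm{BN}}\,\sigma_t^2 + (1 - m_{\mathrm{BN}})\,\hat{\sigma}^2.
\end{align}
Because the teacher's BN statistics are accumulated from its own history in this way, no statistics need to be shuffled or synchronized across GPUs during training.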