Techniques exploiting the sparsity of images in a transform domain have been effective for various applications in image and video processing. Transform learning methods involve cheap computations and have been demonstrated to perform well in applications such as image denoising and medical image reconstruction. Recently, we proposed methods for online learning of sparsifying transforms from streaming signals, which enjoy good convergence guarantees, and involve lower computational costs than online synthesis dictionary learning. In this work, we apply online transform learning to video denoising. We present a novel framework for online video denoising based on high-dimensional sparsifying transform learning for spatio-temporal patches. The patches are constructed either from corresponding 2D patches in successive frames or using an online block matching technique. The proposed online video denoising requires little memory, and offers efficient processing. Numerical experiments compare the performance to the proposed video denoising scheme but fixing the transform to be 3D DCT, as well as prior schemes such as dictionary learning-based schemes, and the state-of-the-art VBM3D and VBM4D on several video data sets, demonstrating the promising performance of the proposed methods.