Spatial statistics often involves Cholesky decomposition of covariance matrices. To ensure scalability to high dimensions, several recent approximations have assumed a sparse Cholesky factor of the precision matrix. We propose a hierarchical Vecchia approximation, whose conditional-independence assumptions imply sparsity in the Cholesky factors of both the precision and the covariance matrix. This remarkable property is crucial for applications to high-dimensional spatio-temporal filtering. We present a fast and simple algorithm to compute our hierarchical Vecchia approximation, and we provide extensions to non-linear data assimilation with non-Gaussian data based on the Laplace approximation. In several numerical comparisons, our methods strongly outperformed alternative approaches.