The inherent dynamics of the neuron membrane potential in Spiking Neural Networks (SNNs) allows processing of sequential learning tasks, avoiding the complexity of recurrent neural networks. The highly-sparse spike-based computations in such spatio-temporal data can be leveraged for energy-efficiency. However, the membrane potential incurs additional memory access bottlenecks in current SNN hardware. To that effect, we propose a 10T-SRAM compute-in-memory (CIM) macro, specifically designed for state-of-the-art SNN inference. It consists of a fused weight (WMEM) and membrane potential (VMEM) memory and inherently exploits sparsity in input spikes leading to 97.4% reduction in energy-delay-product (EDP) at 85% sparsity (typical of SNNs considered in this work) compared to the case of no sparsity. We propose staggered data mapping and reconfigurable peripherals for handling different bit-precision requirements of WMEM and VMEM, while supporting multiple neuron functionalities. The proposed macro was fabricated in 65nm CMOS technology, achieving an energy-efficiency of 0.99TOPS/W at 0.85V supply and 200MHz frequency for signed 11-bit operations. We evaluate the SNN for sentiment classification from the IMDB dataset of movie reviews and achieve within 1% accuracy of an LSTM network with 8.5x lower parameters.