Stochastic gradient methods (SGMs) are the predominant approaches to train deep learning models. The adapti