Huge overhead of beam training poses a significant challenge to mmWave communications. To address this issue, beam tracking has been widely investigated whereas existing methods are hard to handle serious multipath interference and non-stationary scenarios. Inspired by the spatial similarity between low-frequency and mmWave channels in non-standalone architectures, this paper proposes to utilize prior low-frequency information to predict the optimal mmWave beam, where deep learning is adopted to enhance the prediction accuracy. Specifically, periodically estimated low-frequency channel state information (CSI) is applied to track the movement of user equipment, and timing offset indicator is proposed to indicate the instant of mmWave beam training relative to low-frequency CSI estimation. Meanwhile, long-short term memory networks based dedicated models are designed to implement the prediction. Simulation results show that our proposed scheme can achieve higher beamforming gain than the conventional methods while requiring little overhead of mmWave beam training.