Air temperature (Ta) is an essential climatological variable that controls and influences a wide range of Earth surface processes. In this study, we make the first attempt to employ deep learning for Ta mapping, based mainly on spaceborne remote sensing and ground station observations. Because Ta varies greatly in space and time and is sensitive to many factors, assimilation data and socioeconomic data are also included, so that the estimation is based on multi-source data fusion. Specifically, a five-layer deep belief network (DBN) is employed to better capture the complicated, non-linear relationships between Ta and the predictor variables. A layer-wise pre-training process for extracting essential features and a fine-tuning process for optimizing the weight parameters ensure robust prediction of the spatio-temporal distribution of Ta. The DBN model was implemented to map daily maximum Ta across China at 0.01° resolution. Ten-fold cross-validation indicates that the DBN model achieves promising results, with an RMSE of 1.996°C, an MAE of 1.539°C, and an R of 0.986 at the national scale. Compared with the multiple linear regression (MLR), back-propagation neural network (BPNN), and random forest (RF) methods, the DBN model reduces the MAE by 1.340°C, 0.387°C, and 0.222°C, respectively. Further analyses of the spatial distribution and temporal tendency of the prediction errors both validate the great potential of DBN for Ta estimation.
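The evaluation protocol named in the abstract (ten-fold cross-validation scored by RMSE, MAE, and R) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names are hypothetical, the one-predictor least-squares fit is only a stand-in for the DBN, the data are synthetic, and the metrics are pooled over all held-out samples, which may differ from how the folds were formed and scored in the study.

```python
import math
import random


def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


def k_fold_indices(n, k=10, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(x, y):
    """One-predictor least squares; a placeholder for the real model."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def cross_validate(x, y, k=10, seed=0):
    """Train on k-1 folds, predict the held-out fold, pool all predictions."""
    obs, pred = [], []
    for held_out in k_fold_indices(len(x), k, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            obs.append(y[i])
            pred.append(slope * x[i] + intercept)
    return {"rmse": rmse(obs, pred), "mae": mae(obs, pred), "r": pearson_r(obs, pred)}


if __name__ == "__main__":
    # Synthetic example: a hypothetical predictor and a noisy linear response.
    rng = random.Random(42)
    x = [rng.uniform(-10.0, 40.0) for _ in range(500)]
    y = [0.8 * a + 2.0 + rng.gauss(0.0, 2.0) for a in x]
    print(cross_validate(x, y))
```

Pooling the held-out predictions before scoring yields one RMSE, MAE, and R for the whole data set; averaging per-fold scores is an equally common choice and gives slightly different numbers.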