In this paper, based on the Akaike information criterion, root mean square error and robustness coefficient, a rational evaluation of various epidemic models/methods, including seven empirical functions, four statistical inference methods and five dynamical models, on their forecasting abilities is carried out. With respect to the outbreak data of COVID-19 epidemics in China, we find that before the inflection point, all models fail to make a reliable prediction. The Logistic function consistently underestimates the final epidemic size, while the Gompertzs function makes an overestimation in all cases. Towards statistical inference methods, the methods of sequential Bayesian and time-dependent reproduction number are more accurate at the late stage of an epidemic. And the transition-like behavior of exponential growth method from underestimation to overestimation with respect to the inflection point might be useful for constructing a more reliable forecast. Compared to ODE-based SIR, SEIR and SEIR-AHQ models, the SEIR-QD and SEIR-PO models generally show a better performance on studying the COVID-19 epidemics, whose success we believe could be attributed to a proper trade-off between model complexity and fitting accuracy. Our findings not only are crucial for the forecast of COVID-19 epidemics, but also may apply to other infectious diseases.