Accurate prediction of the short-term OD matrix (i.e., the distribution of passenger flows from various origins to destinations) is a crucial task in metro systems. It is highly challenging because many of the factors that affect it change constantly and because real-time data collection is delayed. Recently, some deep learning-based models have been proposed for OD matrix forecasting in ride-hailing and highway traffic scenarios. However, these models cannot sufficiently capture the complex spatiotemporal correlations between stations in metro networks, because they rely on different prior knowledge and contextual settings. In this paper, we propose a hybrid framework, Multi-view TRGRU, to address metro OD matrix prediction. In particular, it uses three modules to model three patterns of flow change: the recent trend, the daily trend, and the weekly trend. In each module, an embedding-based multi-view representation of each station is constructed and fed into a Transformer-based gated recurrent structure, which captures the dynamic spatial dependencies among the OD flows of different stations through a global self-attention mechanism. Extensive experiments on three large-scale, real-world metro datasets demonstrate the superiority of Multi-view TRGRU over competing methods.
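
To make the terminology concrete, the following minimal Python sketch shows how an OD matrix for one time interval could be counted from trip records, and how past matrices might be grouped into recent, daily, and weekly input sequences. The function names, the window sizes, and the slicing scheme are illustrative assumptions; they are not taken from the Multi-view TRGRU implementation.

```python
import numpy as np

def build_od_matrix(trips, num_stations):
    """Count trips per (origin, destination) pair for one time interval.

    `trips` is an iterable of (origin_index, destination_index) pairs
    (a hypothetical record format chosen for this sketch).
    """
    od = np.zeros((num_stations, num_stations), dtype=np.int64)
    for origin, dest in trips:
        od[origin, dest] += 1
    return od

def select_views(history, t, intervals_per_day,
                 n_recent=3, n_daily=2, n_weekly=1):
    """Pick past OD matrices for three temporal patterns before index t.

    `history` has shape (T, N, N). The window sizes are assumptions.
    """
    week = 7 * intervals_per_day
    earliest = t - max(n_recent, n_daily * intervals_per_day, n_weekly * week)
    if earliest < 0 or t > len(history):
        raise ValueError("not enough history for the requested windows")
    # Recent trend: the intervals immediately preceding t.
    recent = [history[t - k] for k in range(n_recent, 0, -1)]
    # Daily trend: the same interval on the previous days.
    daily = [history[t - k * intervals_per_day] for k in range(n_daily, 0, -1)]
    # Weekly trend: the same interval in the previous weeks.
    weekly = [history[t - k * week] for k in range(n_weekly, 0, -1)]
    return np.stack(recent), np.stack(daily), np.stack(weekly)
```

In this sketch each of the three returned arrays would feed one of the three modules; how the paper actually builds its inputs may differ.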