Taxi arrival time prediction is an essential part of building intelligent transportation systems. Traditional arrival time estimation methods mainly rely on traffic map feature extraction, which can not model complex situations and nonlinear spatial and temporal relationships. Therefore, we propose a Multi-View Spatial-Temporal Model (MVSTM) to capture the dependence of spatial-temporal and trajectory. Specifically, we use graph2vec to model the spatial view, dual-channel temporal module to model the trajectory view, and structural embedding to model the traffic semantics. Experiments on large-scale taxi trajectory data show that our approach is more effective than the novel method. The source code can be obtained from https://github.com/775269512/SIGSPATIAL-2021-GISCUP-4th-Solution.