The terahertz (THz) band is envisioned to meet the demanding 100 Gbps data rates of 6G wireless communications. To combat the distance limitation at low hardware cost, ultra-massive MIMO with hybrid beamforming is promising. However, the relationships among wavelength, array size and antenna spacing make the planar-wave channel model (PWM) inaccurate, while the enlarged channel matrix dimension means that the spherical-wave channel model (SWM) requires an excessive number of parameters. Moreover, because hybrid beamforming is adopted, channel estimation (CE) must recover high-dimensional channels from severely compressed channel observations. In this paper, a hybrid spherical- and planar-wave channel model (HSPM) is investigated and proved to be accurate and efficient by adopting the PWM within each subarray and the SWM among subarrays. Furthermore, a two-phase HSPM CE mechanism is developed. In the first phase, a deep convolutional neural network (DCNN) is designed to estimate the parameters of reference subarrays. In the second phase, the geometric relationships among subarrays are leveraged to obtain the remaining channel parameters from those of the reference subarrays and complete CE. Extensive numerical results demonstrate that the HSPM is accurate at various communication distances, array sizes and carrier frequencies. The DCNN converges fast and achieves high accuracy, improving the normalized mean-square error by 5.2 dB over methods in the literature, and has substantially low complexity.
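To illustrate why a planar-wave approximation applied per subarray can track the spherical wavefront more closely than one applied to the whole array, the following minimal sketch compares path lengths for a single line-of-sight source and a one-dimensional uniform linear array. It is a simplified illustration under stated assumptions, not the channel model developed in the paper: the function name `path_lengths`, the 0.3 THz carrier (1 mm wavelength), the 1024-element array, the 16 subarrays, the 5 m distance and the 30 degree angle are all hypothetical choices made for this example.

```python
import numpy as np

def path_lengths(n_ant=1024, n_sub=16, wavelength=1e-3, dist=5.0, angle_deg=30.0):
    """Source-to-antenna path lengths under three models (illustrative only).

    All default values are assumptions chosen for this sketch.
    """
    d = wavelength / 2                               # half-wavelength spacing
    x = (np.arange(n_ant) - (n_ant - 1) / 2) * d     # antenna positions on the x-axis
    theta = np.deg2rad(angle_deg)
    sx, sy = dist * np.sin(theta), dist * np.cos(theta)  # source coordinates

    # Spherical-wave model: exact distance to every antenna.
    r_swm = np.hypot(sx - x, sy)

    # Planar-wave model: first-order expansion about the array center.
    r_pwm = dist - x * np.sin(theta)

    # Hybrid model: exact distance to each subarray center (spherical among
    # subarrays), first-order expansion inside the subarray (planar within).
    r_hspm = np.empty(n_ant)
    for idx in np.array_split(np.arange(n_ant), n_sub):
        xc = x[idx].mean()                           # subarray center
        rc = np.hypot(sx - xc, sy)                   # exact center distance
        sin_local = (sx - xc) / rc                   # local arrival angle
        r_hspm[idx] = rc - (x[idx] - xc) * sin_local
    return r_swm, r_pwm, r_hspm

wavelength = 1e-3
r_swm, r_pwm, r_hspm = path_lengths(wavelength=wavelength)
err_pwm = np.max(np.abs(r_pwm - r_swm)) / wavelength
err_hspm = np.max(np.abs(r_hspm - r_swm)) / wavelength
print(f"max PWM path error:  {err_pwm:.3f} wavelengths")
print(f"max HSPM path error: {err_hspm:.3f} wavelengths")
```

In this sketch the hybrid model needs only one distance and one angle per subarray rather than one distance per antenna, which reflects the parameter saving the abstract attributes to the HSPM. The residual error of each planar approximation grows roughly with the square of the aperture it spans, so restricting it to a subarray shrinks the error accordingly.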