Hybrid analog-digital (A/D) transceivers designed for millimeter wave (mmWave) systems have received substantial research attention, owing to their lower cost and more modest energy consumption compared to their fully-digital counterparts. We further improve their performance by conceiving a Tomlinson-Harashima precoding (THP) based nonlinear joint design for the downlink of multiuser multiple-input multiple-output (MIMO) mmWave systems. Our optimization criterion is to minimize the mean square error (MSE) of the system under channel uncertainties, subject both to a realistic transmit power constraint and to the unit modulus constraint imposed on the elements of the analog beamforming (BF) matrices, which govern the BF operation in the radio frequency domain. We transform this optimization problem into a more tractable form and develop an efficient block coordinate descent (BCD) based algorithm for solving it. We then develop a novel two-timescale nonlinear joint hybrid transceiver design algorithm, which can be viewed as an extension of the BCD-based joint design algorithm and which reduces both the channel state information (CSI) signalling overhead and the effects of outdated CSI. Moreover, we determine a near-optimal cancellation order for the THP structure based on a lower bound of the MSE. The proposed algorithms are guaranteed to converge to a Karush-Kuhn-Tucker (KKT) solution of the original problem. The simulation results demonstrate that our proposed nonlinear joint hybrid transceiver design algorithms significantly outperform the existing linear hybrid transceiver algorithms and approach the performance of the fully-digital transceiver, despite their lower cost and power dissipation.