Future galaxy clustering surveys will probe small scales where non-linearities become important. Because the number of modes accessible on intermediate to small scales is very high, a precise model at these scales is important, especially for discriminating alternative cosmological models from the standard one. Such models typically differ from each other in the mildly non-linear regime, and galaxy clustering data will become very precise on these scales in the near future. As the observable quantity is the angular power spectrum in redshift space, it is important to study the effects of non-linear density and redshift-space distortions (RSD) on the angular power spectrum. We compute non-linear contributions to the angular power spectrum using a flat-sky approximation that we introduce in this work, and compare the results of different perturbative approaches with $N$-body simulations. We find that the TNS perturbative approach is significantly closer to the $N$-body result than Eulerian or Lagrangian 1-loop approximations, the effective field theory of large-scale structure, or a halofit-inspired model. However, none of these prescriptions is accurate enough to model the angular power spectrum well into the non-linear regime. In addition, for narrow redshift bins, $\Delta z \lesssim 0.01$, the angular power spectrum acquires non-linear contributions on all scales, right down to $\ell=2$, and is hence not a reliable tool at this time. To overcome this problem, we need to model non-linear RSD terms, for example as TNS does, but for a matter power spectrum that remains reasonably accurate well into the deeply non-linear regime, such as halofit.