We identify the dominant computational cost of the recently introduced stochastic, internally contracted FCIQMC-NEVPT2 method for large active spaces. This cost stems from the contribution to the four-body intermediates from sampled determinant pairs of low excitation level. We develop an effective way to mitigate it by introducing an additional stochastic step into the sampling of the required NEVPT2 intermediates. We find that this systematically improvable additional sampling can reduce simulation time by 80% without introducing appreciable error, and the saving is expected to increase for larger active spaces. We combine this enhanced sampling scheme with full stochastic orbital optimization for the first time, and apply it to compute FCIQMC-NEVPT2 energies for spin states of an iron porphyrin system within (24,24) active spaces using relatively meagre computational resources. With this improved stochastic methodology, active spaces of this size can now be considered routine for NEVPT2 calculations of strongly correlated molecular systems.