We study how stochastic sampling of the initial mass function (IMF) affects the main emission lines used to estimate gas-phase metallicities in extragalactic sources. Using the stochastic stellar population code SLUG and the photoionisation code Cloudy, we show that stochastic sampling of the massive end of the IMF can lead to clear variations in the strength of energetic emission lines such as [OIII] relative to the Balmer lines. We use this to quantify the impact on the $T_e$, N2O2, R23 and O3N2 metallicity calibrators. We find that stochastic sampling of the IMF leads to a systematic over-estimate of O/H in galaxies with low star formation rates (< $10^{-3}$ M$_\odot$/yr) when using the N2O2, R23 and O3N2 strong-line methods, and to an under-estimate when using the $T_e$ method on galaxies of sub-solar metallicity. We point out that, while the SFR(H$\alpha$)-to-SFR(UV) ratio can be used to identify systems in which the IMF might be insufficiently sampled, it does not provide enough information to fully correct the metallicity calibrations at low star formation rates. Care must therefore be taken in the choice of metallicity indicator in such systems. Of the indicators we tested, N2O2 proves the most robust, with a bias of 0.08 dex for models with SFR = $10^{-4}$ M$_\odot$/yr and solar metallicity.
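To illustrate why the massive end of the IMF is poorly sampled when little stellar mass is formed, the following is a minimal toy sketch, not the SLUG algorithm. It assumes a single power-law (Salpeter-like) IMF with slope 2.35 between 0.08 and 120 M$_\odot$, a simple "draw until the target mass is reached" rule, and an arbitrary 20 M$_\odot$ threshold for "massive" stars; all function names and parameter values here are hypothetical choices made for illustration.

```python
import random
import statistics

# Assumed toy IMF parameters (not taken from the paper).
M_MIN, M_MAX = 0.08, 120.0   # stellar mass limits in solar masses
ALPHA = 2.35                 # power-law slope, dN/dm proportional to m**(-ALPHA)


def draw_mass(rng):
    """Draw one stellar mass by inverse-transform sampling of the power law."""
    p = 1.0 - ALPHA
    u = rng.random()
    return (M_MIN**p + u * (M_MAX**p - M_MIN**p)) ** (1.0 / p)


def sample_population(total_mass, rng):
    """Draw stars one at a time until their summed mass reaches total_mass."""
    masses = []
    total = 0.0
    while total < total_mass:
        m = draw_mass(rng)
        masses.append(m)
        total += m
    return masses


def massive_fraction(masses, m_cut=20.0):
    """Fraction of the stellar mass locked in stars heavier than m_cut."""
    return sum(m for m in masses if m > m_cut) / sum(masses)


if __name__ == "__main__":
    rng = random.Random(42)
    # A small and a large total mass stand in for low and high star formation.
    for total in (1e2, 1e4):
        fracs = [massive_fraction(sample_population(total, rng)) for _ in range(20)]
        print(f"M_tot={total:.0e}  mean={statistics.mean(fracs):.3f}  "
              f"scatter={statistics.pstdev(fracs):.3f}")
```

In this toy setup, most low-mass realisations contain no massive stars at all while a few contain one that dominates the mass budget, so the ionising output, and with it line ratios such as [OIII] relative to the Balmer lines, would be expected to scatter far more than in a well-sampled population.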