Surface codes exploit topological protection to increase error resilience in quantum computing devices and can, in principle, be implemented in existing hardware. They are among the most promising candidates for active error correction, not least because they admit a polynomial-time decoding algorithm with one of the highest predicted error thresholds. We consider how this threshold depends on the underlying assumptions, including different noise models, and analyze the performance of minimum-weight perfect matching (MWPM) decoding compared with mathematically optimal maximum-likelihood (ML) decoding. Our ML algorithm tracks the success probabilities of all possible corrections over time and accounts for individual gate failure probabilities and for error propagation due to the syndrome measurement circuit. We present the first evidence for the true error threshold of an optimal circuit-level decoder, which allows us to draw conclusions about what improvements are possible over standard MWPM.
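
To make the distinction between the two decoding strategies concrete, the following is a minimal toy sketch and not the algorithm described above. It assumes a distance-3 rotated surface code with independent X errors on the nine data qubits and perfect syndrome measurements (no circuit-level noise), and it enumerates all error patterns by brute force. The stabilizer layout, the function names, and the use of an exhaustive minimum-weight search as a stand-in for MWPM are all illustrative assumptions.

```python
from itertools import product

# Assumed layout: 9 data qubits on a 3x3 grid, indexed row by row.
# Z-type stabilizers detect X errors; this particular layout is illustrative.
Z_STABILIZERS = [(0, 1, 3, 4), (4, 5, 7, 8), (2, 5), (3, 6)]
# Parity of an X error on the support of logical Z fixes its logical class.
LOGICAL_Z_SUPPORT = (0, 1, 2)


def syndrome(error):
    """Parity of the X error on each Z-type stabilizer."""
    return tuple(sum(error[q] for q in stab) % 2 for stab in Z_STABILIZERS)


def logical_class(error):
    """0 or 1; errors with equal syndrome and class differ by a stabilizer."""
    return sum(error[q] for q in LOGICAL_Z_SUPPORT) % 2


def probability(error, p):
    """Probability of an error pattern under independent per-qubit rates p."""
    prob = 1.0
    for bit, p_q in zip(error, p):
        prob *= p_q if bit else 1.0 - p_q
    return prob


def success_probabilities(p):
    """Return (ML, minimum-weight) success probabilities by full enumeration."""
    class_prob = {}  # (syndrome, class) -> total probability of that class
    lightest = {}    # syndrome -> (weight, class) of a lowest-weight error
    for error in product((0, 1), repeat=9):
        s, c = syndrome(error), logical_class(error)
        class_prob[(s, c)] = class_prob.get((s, c), 0.0) + probability(error, p)
        w = sum(error)
        if s not in lightest or w < lightest[s][0]:
            lightest[s] = (w, c)
    # ML: for each syndrome, pick the class with the larger total probability.
    ml = sum(
        max(class_prob.get((s, 0), 0.0), class_prob.get((s, 1), 0.0))
        for s in lightest
    )
    # Minimum weight: for each syndrome, pick the class of a lightest error.
    mw = sum(class_prob[(s, lightest[s][1])] for s in lightest)
    return ml, mw


if __name__ == "__main__":
    uniform = [0.05] * 9
    skewed = [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.2, 0.2, 0.2]
    for rates in (uniform, skewed):
        ml, mw = success_probabilities(rates)
        print(f"ML: {ml:.6f}  minimum weight: {mw:.6f}")
```

In this sketch the ML choice sums the probability of every error in a logical class, so its success probability can never fall below that of the minimum-weight choice; how large the gap becomes under realistic circuit-level noise is precisely the kind of question that a toy model of this size cannot answer.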