Discovering and isolating infected individuals is a cornerstone of epidemic control. Because many infectious diseases spread through close contacts, contact tracing is a key tool for case discovery and control. However, although contact tracing is widely practiced, its mathematical understanding has not been fully established, and it remains unclear what determines its efficacy. Here, we reveal that backward tracing (tracing from whom disease spreads) is profoundly more effective than forward tracing (tracing to whom disease spreads). The effectiveness of backward tracing is due to simple but overlooked biases arising from the heterogeneity in contacts. Using simulations on both synthetic and high-resolution empirical contact datasets, we show that even at a small probability of detecting infected individuals, strategically executed contact tracing can prevent a significant fraction of further transmissions. We also show that, in terms of the number of prevented transmissions per isolation, case isolation combined with a small amount of contact tracing is more efficient than case isolation alone. By demonstrating that backward contact tracing is highly effective at discovering super-spreading events, we argue that the potential effectiveness of contact tracing has been underestimated. Therefore, there is a critical need to revisit current contact tracing strategies so that they leverage all forms of bias. Our results also have important consequences for digital contact tracing, because it will be crucial to incorporate the capability for backward and deep tracing while adhering to the privacy-preserving requirements of these new platforms.
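
To give a concrete sense of how heterogeneity in contacts can bias whom tracing reaches, the following minimal sketch compares the mean number of contacts of a uniformly chosen individual with that of an individual reached by following a randomly chosen contact. It illustrates only the generic degree-biased sampling effect, under the assumption of undirected contacts, and is not the model or simulation used in this work. The function names and the population are hypothetical.

```python
from fractions import Fraction


def mean_degree(degree_counts):
    """Mean number of contacts of a uniformly chosen individual, E[k]."""
    n = sum(degree_counts.values())
    return Fraction(sum(k * c for k, c in degree_counts.items()), n)


def mean_degree_via_contact(degree_counts):
    """Mean number of contacts of an individual reached through a random contact.

    An individual with k contacts is reached with probability proportional
    to k, so the mean becomes E[k^2] / E[k].
    """
    total_k = sum(k * c for k, c in degree_counts.items())
    total_k2 = sum(k * k * c for k, c in degree_counts.items())
    return Fraction(total_k2, total_k)


# Hypothetical population (an assumption for illustration only):
# 90 individuals with 2 contacts each and 10 individuals with 20 contacts each.
population = {2: 90, 20: 10}

print(float(mean_degree(population)))              # 3.8
print(float(mean_degree_via_contact(population)))  # about 11.47
```

In this toy population, an individual reached through a contact has roughly three times as many contacts on average as a typical individual, which suggests why following contacts tends to lead to highly connected individuals. When every individual has the same number of contacts, the two means coincide and the bias disappears.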