The abundance of massive galaxy clusters is a powerful probe of departures from General Relativity (GR) on cosmic scales. Despite the stringent constraints currently placed by stellar and galactic tests, on larger scales alternative theories of gravity such as $f(R)$ can still work as effective theories. Here we present constraints on two popular models of $f(R)$, Hu-Sawicki and designer, derived from a fully self-consistent analysis of current samples of X-ray selected clusters that accounts for all the covariances between cosmological and astrophysical parameters. Using cluster number counts in combination with recent data from the cosmic microwave background (CMB) and the CMB lensing potential generated by large-scale structure, as well as with other cosmological constraints on the background expansion history and the mean matter density, we obtain the upper bounds $\log_{10}|f_{R0}| < -4.79$ and $\log_{10}B_0 < -3.75$ at the 95.4 per cent confidence level, for the Hu-Sawicki (with $n=1$) and designer models, respectively. The robustness of our results derives from high-quality cluster growth data for the most massive clusters known out to redshifts $z \sim 0.5$, tight control of systematic uncertainties including an accurate and precise mass calibration from weak gravitational lensing data, and the use of the full shape of the halo mass function over the mass range of our data.