[abridged] How does a star cluster of more than a few tens of thousands of solar masses form? We present the case of the cluster NGC 346 in the Small Magellanic Cloud and its star-forming region N66, and we propose a scenario for its formation based on observations of the rich stellar populations in the region. Young massive clusters (YMCs) host a high fraction of early-type stars, indicating an extremely high star formation efficiency. The Magellanic Clouds host a wide range of such clusters, the youngest of which are still embedded in their giant HII regions. Hubble Space Telescope imaging of such star-forming complexes allows the detailed study of star formation on scales typical of molecular clouds. Our cluster analysis of newly born stars in N66 shows that star formation in the region proceeds in a clumpy, hierarchical fashion, leading to the formation of both a dominant YMC, which hosts about half of the observed pre-main-sequence population, and a dispersed, self-similar distribution of the remaining stars. We investigate the correlation between the star formation rate derived from star counts and the molecular gas surface density in order to unravel the physical conditions that gave birth to NGC 346. We find a steep correlation between these two parameters, with considerable scatter. The fraction of mass in stars is found to be systematically higher within the central 15 pc (where the YMC is located) than outside, which suggests variations in the star formation efficiency within the same star-forming complex. This trend possibly reflects a change of star formation efficiency in N66 between clustered and non-clustered star formation. Our findings suggest that the formation of NGC 346 is the combined result of star formation regulated by turbulence and of early dynamical evolution induced by the gravitational potential of the dense interstellar medium.
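
For orientation, the two quantities compared in the abstract can be written down in a conventional form. The sketch below assumes the usual power-law (Schmidt-Kennicutt-type) parametrization for the correlation between star formation rate and gas surface density, and the usual mass-fraction definition of star formation efficiency. The abstract states neither the functional form nor any fitted values, so the symbols $A$, $N$, and $\epsilon$ are notational assumptions introduced here for illustration only.

```latex
% Assumed power-law form of the correlation (not stated in the abstract):
% Sigma_SFR is the star formation rate surface density from star counts,
% Sigma_gas is the molecular gas surface density.
\begin{align}
  \Sigma_{\mathrm{SFR}} &= A \, \Sigma_{\mathrm{gas}}^{N}, \\
  \log \Sigma_{\mathrm{SFR}} &= \log A + N \log \Sigma_{\mathrm{gas}}, \\
% Assumed definition of the fraction of mass in stars, used as a proxy
% for the star formation efficiency within a given region:
  \epsilon &= \frac{M_{\star}}{M_{\star} + M_{\mathrm{gas}}}.
\end{align}
```

In this notation, a "steep" correlation corresponds to a large slope $N$ in the log-log plane, and the reported contrast between the central 15 pc and the outer region corresponds to a higher $\epsilon$ inside than outside.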