Spectral distortions of the cosmic microwave background (CMB) provide a unique tool for learning about the early phases of cosmic history, reaching deep into the primordial Universe. At redshifts $z<10^6$, thermalization processes become inefficient, and existing limits from COBE/FIRAS imply that the energy injected into the CMB is limited to $\Delta \rho/\rho<6\times 10^{-5}$ (95% c.l.). However, at higher redshifts, when thermalization is efficient, the constraint weakens and $\Delta \rho/\rho \simeq 0.01-0.1$ could in principle have occurred. Existing computations of the evolution of distortions commonly assume $\Delta \rho/\rho \ll 1$ and thus become inaccurate in this case. Similarly, relativistic temperature corrections become relevant for large energy release, but have previously not been modeled as carefully. Here we study the evolution of distortions and the thermalization process after a single large energy release at $z>10^5$. We show that for large distortions the thermalization efficiency is significantly reduced and that the distortion visibility is sizeable to much earlier times. This tightens spectral distortion constraints on low-mass primordial black holes with masses $M_{\rm PBH} < 6\times 10^{11}$ g. Similarly, distortion limits on the amplitude of the small-scale curvature power spectrum at wavenumbers $k>10^4\,{\rm Mpc}^{-1}$ and on short-lived decaying particles with lifetimes $t_X< 10^7$ s are tightened; however, these still require a more detailed time-dependent treatment. We also briefly discuss the constraints from measurements of the effective number of relativistic degrees of freedom and light element abundances, and how these complement spectral distortion limits.