Consider a binary linear code of length $N$ and minimum distance $d_{\text{min}}$, transmission over the binary erasure channel with parameter $0 < \epsilon < 1$ or the binary symmetric channel with parameter $0 < \epsilon < \frac12$, and block-MAP decoding. It was shown by Tillich and Zémor that in this case the error probability of the block-MAP decoder transitions quickly from $\delta$ to $1-\delta$ for any $\delta>0$ if the minimum distance is large. In particular, the width of the transition is of order $O(1/\sqrt{d_{\text{min}}})$. We strengthen this result by showing that, under suitable conditions on the weight distribution of the code, the transition width can be as small as $\Theta(1/N^{\frac12-\kappa})$, for any $\kappa>0$, even if the minimum distance of the code does not grow linearly in $N$. This condition applies, e.g., to Reed-Muller codes. Since $\Theta(1/N^{\frac12})$ is the smallest transition width possible for any code, we speak of almost optimal scaling. We emphasize that the width of the transition says nothing about the location of the transition. Therefore this result has no bearing on whether a code is capacity-achieving or not. As a second contribution, we present a new estimate on the derivative of the EXIT function, the proof of which is based on the Blowing-Up Lemma.