We examine the performance of standard pre-main-sequence (PMS) stellar evolution models against the accurately measured properties of a benchmark sample of 26 PMS stars in 13 eclipsing binary (EB) systems. We provide definitive compilations of all fundamental properties of the EBs and of the various PMS model sets. In the H-R diagram, the masses inferred for the individual stars by the models are accurate to better than 10% above 1 Msun, but below 1 Msun they are discrepant by 50-100%. We find evidence that the failure of the models to match the data is linked to the triples in the EB sample; at least half of the EBs possess tertiary companions. Excluding the triples, the models reproduce the stellar masses to better than ~10% in the H-R diagram down to 0.5 Msun, below which the current sample is fully contaminated by tertiaries. We consider several mechanisms by which a tertiary might alter the EB properties and thus corrupt the agreement with stellar model predictions. We show that the energies of the tertiary orbits are comparable to the energy needed to explain the scatter in the EB properties through injection of heat, perhaps involving tidal interaction. The evidence at hand suggests that this mechanism, however it operates in detail, has more influence on the surface properties of the stars than on their internal structure, as the lithium abundances are broadly in good agreement with model predictions. The EBs that are members of young clusters appear individually coeval to within 20%, but collectively show an apparent age spread of ~50%, suggesting true age spreads in young clusters. However, this apparent spread in the EB ages may also result from scatter in the EB properties induced by tertiaries. [Abridged]