We present a methodology for ensuring the robustness of our analysis pipeline, which separates the global 21-cm hydrogen cosmology signal from large systematics using singular value decomposition (SVD) of training sets. We show that traditional goodness-of-fit metrics that assess the fit to the full data, such as the $\chi^2$ statistic, may fail to detect a suboptimal extraction of the 21-cm signal when it is fit alongside one or more additional components, because of the significant covariance between the components. However, we find that comparing the number of SVD eigenmodes chosen by the pipeline for each component in a given fit to the distribution of eigenmodes chosen for synthetic data realizations created from training set curves can detect when one or more of the training sets is insufficient to optimally extract the signal. Furthermore, this test can identify which training set (e.g. foreground, 21-cm signal) needs to be modified in order to better describe the data and improve the quality of the 21-cm signal extraction. We also extend this goodness-of-fit testing to cases where a prior distribution derived from the training sets is applied and find that, in this case, both the $\chi^2$ statistic and the recently introduced $\psi^2$ statistic are able to detect inadequacies in the training sets, owing to the increased restrictions imposed by the prior. Crucially, the tests described in this paper can be performed when analyzing any type of observations with our pipeline.
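The eigenmode-count comparison can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the pipeline itself: it uses a single component rather than a joint foreground and signal fit, a BIC-like criterion ($\chi^2 + k \ln N$) as a stand-in for whatever criterion the pipeline uses to choose the number of modes, and an invented power-law training set, noise level, and trough. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: every value here is an illustrative assumption.
freqs = np.linspace(50.0, 100.0, 50)  # frequency channels in MHz
noise_level = 0.01                    # uniform Gaussian noise in K
max_modes = 15                        # largest number of modes considered


def make_training_set(num_curves):
    """Toy foreground training set: power laws with varying amplitude and index."""
    amps = rng.uniform(1500.0, 2500.0, size=(num_curves, 1))
    betas = rng.uniform(2.4, 2.6, size=(num_curves, 1))
    return amps * (freqs[None, :] / 75.0) ** (-betas)


def choose_num_modes(data, modes):
    """Return the number of leading SVD modes minimizing chi^2 + k ln(N)."""
    best_k, best_ic = 1, np.inf
    for k in range(1, max_modes + 1):
        basis = modes[:k]                      # shape (k, N), orthonormal rows
        resid = data - (basis @ data) @ basis  # least-squares residual
        ic = np.sum((resid / noise_level) ** 2) + k * np.log(data.size)
        if ic < best_ic:
            best_k, best_ic = k, ic
    return best_k


training_set = make_training_set(200)
# Rows of `modes` are the SVD eigenmodes, ordered by singular value.
_, _, modes = np.linalg.svd(training_set, full_matrices=False)

# Reference distribution: mode counts for synthetic data built from
# training set curves plus noise.
picks = rng.integers(0, len(training_set), size=500)
reference_counts = np.array([
    choose_num_modes(curve + rng.normal(0.0, noise_level, freqs.size), modes)
    for curve in training_set[picks]
])

# "Observed" data containing a feature absent from the training set.
trough = -0.2 * np.exp(-0.5 * ((freqs - 78.0) / 5.0) ** 2)
observed = training_set[0] + trough + rng.normal(0.0, noise_level, freqs.size)
observed_count = choose_num_modes(observed, modes)

# Fraction of synthetic realizations that needed at least as many modes.
tail_fraction = np.mean(reference_counts >= observed_count)
print(observed_count, tail_fraction)
```

In this sketch, a small `tail_fraction` would indicate that the fit needed more modes than is typical for data drawn from the training set, which is the kind of signature the test looks for. In the multi-component case, the same comparison would be made separately for each component.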