This paper addresses the problem of performing statistical inference about a population that can only be identified through classifier predictions. The problem is motivated by scientific studies in which human labeling of a population is replaced by a classifier. For downstream analysis of the population based on classifier predictions to be sound, the predictions must generalize equally well across experimental conditions. In this paper, we formalize the task of statistical inference using classifier predictions and propose bootstrap procedures that enable inference with a generalizable classifier. We demonstrate the performance of our methods through extensive simulations and a case study with live-cell imaging data.