Fairness in Risk Assessment Instruments: Post-Processing to Achieve Counterfactual Equalized Odds


Abstract

In domains such as criminal justice, medicine, and social welfare, decision makers increasingly have access to algorithmic Risk Assessment Instruments (RAIs). RAIs estimate the risk of an adverse outcome such as recidivism or child neglect, potentially informing high-stakes decisions such as whether to release a defendant on bail or initiate a child welfare investigation. It is important to ensure that RAIs are fair, so that the benefits and harms of such decisions are equitably distributed. The most widely used algorithmic fairness criteria are formulated with respect to observable outcomes, such as whether a person actually recidivates, but these criteria are misleading when applied to RAIs. Since RAIs are intended to inform interventions that can reduce risk, the prediction itself affects the downstream outcome. Recent work has argued that fairness criteria for RAIs should instead use potential outcomes, i.e., the outcomes that would occur in the absence of an appropriate intervention. However, no methods currently exist to satisfy such fairness criteria. In this paper, we target one such criterion, counterfactual equalized odds. We develop a post-processed predictor that is estimated via doubly robust estimators, extending and adapting previous post-processing approaches to the counterfactual setting. We also provide doubly robust estimators of the risk and fairness properties of arbitrary fixed post-processed predictors. Our predictor converges to an optimal fair predictor at fast rates. We illustrate properties of our method and show that it performs well on both simulated and real data.
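For reference, the criterion can be sketched in standard potential-outcomes notation; the symbols below are our gloss, not necessarily the paper's exact notation. Writing A for the protected attribute, \hat{Y} for the binary prediction, and Y^0 for the potential outcome under the baseline (no-intervention) condition, counterfactual equalized odds requires

    P(\hat{Y} = 1 \mid Y^0 = y, A = a) = P(\hat{Y} = 1 \mid Y^0 = y, A = a')   for all groups a, a' and y \in \{0, 1\},

that is, equal counterfactual false positive rates (y = 0) and counterfactual true positive rates (y = 1) across groups. Because Y^0 is unobserved for individuals who received the intervention, these error rates cannot be computed directly from data, which is why doubly robust estimation is needed.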
