\begin{abstract}\noindent
Disparate treatment occurs when a machine learning model produces different decisions for groups defined by a legally protected or sensitive attribute (e.g., race, gender).
%
In domains where prediction accuracy is paramount, it is acceptable to fit a model that exhibits disparate treatment. %Here, a sensitive attribute may be used as an input to the model, or different models may be trained for different groups. %Motivated by this, a fundamental problem arises: does disparate treatment always benefit model performance?
%
We explore the effect of splitting classifiers (i.e., training and deploying a separate classifier on each group) and %prove that, %in the large sample regime, splitting never harms any group's accuracy. We strengthen this claim by deriving
derive an information-theoretic \emph{impossibility} result: there exist precise conditions under which a group-blind classifier will \emph{always} have a non-trivial performance gap from the split classifiers.
We further demonstrate that, in the finite sample regime, splitting is no longer always beneficial: its effect depends on the number of samples from each group and the complexity of the hypothesis class.
We provide data-dependent bounds for understanding the effect of splitting and illustrate these bounds on real-world datasets.
\end{abstract}
