
\begin{abstract}

Machine learning models have been criticized for reflecting unfair biases in the training data.

Rather than addressing this by designing fair learning algorithms directly, we focus on generating fair synthetic data, such that any downstream learner trained on it is fair.

Generating fair synthetic data from unfair data, \emph{while remaining truthful to the underlying data-generating process (DGP)}, is non-trivial.

In this paper, we introduce DECAF, a GAN-based \emph{fair} synthetic data generator for tabular data.

With DECAF, we embed the DGP explicitly as a structural causal model in the input layers of the generator, allowing each variable to be reconstructed conditioned on its causal parents.

This procedure enables \textit{inference-time} debiasing, where biased edges can be strategically removed to satisfy user-defined fairness requirements.

The DECAF framework is versatile and compatible with several popular definitions of fairness.

In our experiments, we show that DECAF successfully removes undesired bias and, in contrast to existing methods, is capable of generating high-quality synthetic data.

Furthermore, we provide theoretical guarantees on the generator's convergence and the fairness of downstream models.

\end{abstract}
