2 May 2019
# The Variational Fair Autoencoder

Louizos et al., ICLR 2016, [link]

tags: *representation learning* - *iclr* - *2016*

The goal of this work is to propose a variational autoencoder based model that learns latent representations independent of some sensitive attribute present in the data, while retaining enough information to solve the task at hand, e.g., classification. This independence constraint is incorporated via a loss term based on the Maximum Mean Discrepancy.

- Pros (+): Well justified, fast implementation trick, semi-supervised setting.
- Cons (-): Requires explicit knowledge of the sensitive attribute.

Given input data $x$, the goal is to learn a representation $z_1$ of $x$ that factorizes out *nuisance* or *sensitive* variables $s$, while retaining the task-relevant content. Working in the `VAE` (Variational Autoencoder) framework, this is modeled as a generative process:

$$z_1 \sim p(z_1), \qquad x \sim p_\theta(x \mid z_1, s)$$

where the prior on the latent $z_1$ is explicitly made invariant to the variables to filter out, $s$. Introducing the decoder $p_\theta(x \mid z_1, s)$ and an approximate posterior (encoder) $q_\phi(z_1 \mid x, s)$, this model can be trained using the standard variational lower bound objective (`ELBO`) [1]:

$$\mathcal{L}(x; \theta, \phi) = \mathbb{E}_{q_\phi(z_1 \mid x, s)}\big[\log p_\theta(x \mid z_1, s)\big] - \mathrm{KL}\big(q_\phi(z_1 \mid x, s) \,\|\, p(z_1)\big)$$
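As a concrete illustration (not the paper's code), the KL term of this bound has a closed form when the encoder outputs a diagonal Gaussian and the prior is a standard normal:

```python
import numpy as np

def kl_diag_gaussian(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

# The KL vanishes exactly when the posterior matches the prior...
print(kl_diag_gaussian(np.zeros((1, 4)), np.zeros((1, 4))))  # -> [0.]
# ...and grows as the posterior mean drifts away from zero.
print(kl_diag_gaussian(np.ones((1, 4)), np.zeros((1, 4))))   # -> [2.]
```

In practice this term is combined with a Monte-Carlo estimate of the reconstruction likelihood via the reparameterization trick.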

In order to make the learned representations relevant to a specific task, the authors propose to incorporate label knowledge during the feature learning stage. This is particularly useful when the target label $y$ is correlated with the sensitive information $s$; otherwise, unsupervised learning could yield arbitrary representations whose only merit is to get rid of $s$. In practice, this is done by considering *two distinct independent sources of information* for the latent: $y$, the label of the data point (a categorical variable in the classification scenario), and a continuous variable $z_2$ that contains the remaining data variations which depend on neither $y$ nor $s$. This yields the following generative process:

$$z_2 \sim p(z_2), \qquad y \sim p(y), \qquad z_1 \sim p_\theta(z_1 \mid z_2, y), \qquad x \sim p_\theta(x \mid z_1, s)$$
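To make the two-stage model concrete, here is a minimal ancestral-sampling sketch, with hypothetical linear maps standing in for the paper's neural-network decoders:

```python
import numpy as np

rng = np.random.default_rng(0)
d_z1, d_z2, d_x, n_classes = 4, 3, 8, 2

# Hypothetical linear "decoders" (the paper uses neural networks).
W_z1 = rng.normal(size=(d_z2 + n_classes, d_z1))
W_x = rng.normal(size=(d_z1 + 1, d_x))  # +1 input for the sensitive bit s

def sample(s):
    z2 = rng.normal(size=d_z2)                       # z2 ~ p(z2)
    y = np.eye(n_classes)[rng.integers(n_classes)]   # y  ~ p(y), one-hot
    z1 = np.concatenate([z2, y]) @ W_z1 + rng.normal(size=d_z1)  # z1 ~ p(z1 | z2, y)
    x = np.concatenate([z1, [s]]) @ W_x              # mean of p(x | z1, s)
    return x

x = sample(s=1.0)
```

Sampling follows the chain $(z_2, y) \rightarrow z_1 \rightarrow x$, with the sensitive variable $s$ injected only at the last step.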

In the *fully supervised* setting, when the label $y$ is known, the ELBO can be extended to handle this *two-stage latent variables* scenario in a simple manner:

$$\mathcal{L}_{\text{sup}}(x, y) = \mathbb{E}_{q_\phi(z_1 \mid x, s)\, q_\phi(z_2 \mid z_1, y)}\big[\log p_\theta(x \mid z_1, s) + \log p_\theta(z_1 \mid z_2, y) + \log p(y) + \log p(z_2) - \log q_\phi(z_1 \mid x, s) - \log q_\phi(z_2 \mid z_1, y)\big]$$

In the *semi-supervised* scenario, the label $y$ can be missing for some of the samples. In that case, the ELBO can once again be extended by treating $y$ as another latent variable, with encoder $q_\phi(y \mid z_1)$ and prior $p(y)$.
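For the unlabeled samples, the bound marginalizes the label out: the per-class supervised bound is weighted by $q_\phi(y \mid z_1)$ and the entropy of that posterior is added. A toy sketch of this combination step (the function name and inputs are mine, not the paper's):

```python
import numpy as np

def unlabeled_bound(labeled_bounds, q_y):
    """Combine per-class supervised ELBOs with the label posterior q(y | z1):
    sum_y q(y) * ELBO(x, y) + H(q(y))."""
    q_y = np.asarray(q_y, dtype=float)
    labeled_bounds = np.asarray(labeled_bounds, dtype=float)
    entropy = -np.sum(q_y * np.log(q_y + 1e-12))
    return float(np.dot(q_y, labeled_bounds) + entropy)
```

For instance, with a uniform posterior over two classes whose supervised bounds are both $-1$, the result is $-1 + \log 2$: the entropy term rewards keeping the label uncertain when the data does not resolve it.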

So far, the model incorporates explicit statistical independence between the *sensitive* variables $s$ to protect and the latent information $z_1$ to capture from the input data $x$. The authors additionally propose to regularize the marginal posterior $q_\phi(z_1 \mid s)$ using the Maximum Mean Discrepancy (`MMD`) [2]. The `MMD` is a distance between probability distributions that compares their *mean statistics*, and is often combined with the kernel trick for efficient computation. Since we want the latent representation not to carry any information about $s$, this constraint can be enforced by making the distributions $q_\phi(z_1 \mid s = 0)$ and $q_\phi(z_1 \mid s = 1)$ close to one another, as measured by their `MMD`. This introduces a second loss term to minimize, $\ell_{\text{MMD}}\big(q_\phi(z_1 \mid s = 0),\, q_\phi(z_1 \mid s = 1)\big)$.

The `MMD` is defined as a difference of mean statistics between two distributions. *Empirically*, given two sample sets $X = \{x_i\}_{i=1}^N$ and $Y = \{y_j\}_{j=1}^M$ and a feature map $\psi$, the squared distance can be written as

$$
\begin{aligned}
\ell_{\text{MMD}}^2(X, Y) &= \left\| \frac{1}{N}\sum_{i=1}^N \psi(x_i) - \frac{1}{M}\sum_{j=1}^M \psi(y_j) \right\|^2 \\
&= \frac{1}{N^2}\sum_{i, i'} k(x_i, x_{i'}) - \frac{2}{NM}\sum_{i, j} k(x_i, y_j) + \frac{1}{M^2}\sum_{j, j'} k(y_j, y_{j'})
\end{aligned}
$$

where the last line is the result of the *kernel trick* and $k(a, b) = \langle \psi(a), \psi(b) \rangle$ is an arbitrary kernel function. In practice, the MMD loss term is computed over each batch: a naive implementation would require computing the full kernel matrix, i.e., $k$ for each pair of samples in the batch, where each evaluation requires $O(d)$ operations, $d$ being the dimensionality of $z_1$. Instead, following [6], they use *Random Kitchen Sinks* [5], a low-rank method that allows computing the MMD loss more efficiently.
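Below is a small numpy sketch contrasting the naive quadratic-time MMD estimate with the Random-Kitchen-Sinks approximation; the parameter choices (kernel bandwidth, number of random features) are illustrative, not the paper's:

```python
import numpy as np

def mmd2_rbf(X, Y, gamma=1.0):
    """Exact (biased) squared MMD with RBF kernel k(a, b) = exp(-gamma ||a - b||^2).
    Requires the full kernel matrices: quadratic in the number of samples."""
    def k(A, B):
        sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
        return np.exp(-gamma * sq)
    return k(X, X).mean() - 2.0 * k(X, Y).mean() + k(Y, Y).mean()

def mmd2_rks(X, Y, W, b):
    """Linear-time squared MMD via random Fourier features (Random Kitchen Sinks):
    phi(x) = sqrt(2/D) cos(x W + b) approximates the RBF kernel above when
    W ~ N(0, 2*gamma*I) and b ~ U[0, 2*pi)."""
    D = W.shape[1]
    phi = lambda A: np.sqrt(2.0 / D) * np.cos(A @ W + b)
    diff = phi(X).mean(axis=0) - phi(Y).mean(axis=0)
    return float(diff @ diff)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))        # stand-in for z1 samples with s = 0
Y = rng.normal(size=(100, 2)) + 3.0  # stand-in for z1 samples with s = 1

gamma, D = 0.5, 1000
W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(2, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)
# The two estimates agree up to the random-feature approximation error.
```

With $D$ random features, the projection costs $O(B D d)$ per batch of size $B$ and the loss is just the squared distance between the two feature means, instead of the $O(B^2 d)$ pairwise kernel computation.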

The authors consider *three* experimental scenarios to test the proposed Variational Fair Autoencoder (`VFAE`):

- **Fairness:** here the goal is to learn a classifier while making the latent representation independent of protected information $s$, a certain attribute of the data (e.g., predict "income" independently of "age"). In all the datasets considered, $s$ is correlated with the ground-truth label $y$, making the setting more challenging.
- **Domain adaptation:** using the Amazon Reviews dataset, the task is to classify reviews as either positive or negative, coming from a supervised source domain and an unsupervised target one, while being independent of the domain (book, dvd, electronics...).
- **Invariant representation:** finally, the last task consists in face identity recognition while being explicitly invariant to various noise factors (lighting, pose, etc.).

The models are evaluated on **(i)** their performance on the target task and **(ii)** how well the sensitive variable $s$ can still be predicted from the learned representation (lower is better).

On the **Fairness** task, the proposed method seems to be better at ignoring the sensitive information, although its trade-off between accuracy and fairness is not always the best. The main baseline is the model presented in [3], which also exploits a moment-matching penalty term, although a simpler one, and does not use the `VAE` framework.
In the **Domain Adaptation** task, the authors mostly compare to `DANN` [4], which proposes a simple adversarial training technique to align different domain representations. The `VFAE` results are on par, or even slightly better, in most scenarios.

- [1] Auto-Encoding Variational Bayes, *Kingma and Welling, ICLR 2014*
- [2] A Kernel Method for the Two-Sample-Problem, *Gretton et al., NeurIPS 2006*
- [3] Learning Fair Representations, *Zemel et al., ICML 2013*
- [4] Domain-Adversarial Training of Neural Networks, *Ganin et al., JMLR 2016*
- [5] Random Features for Large-Scale Kernel Machines, *Rahimi and Recht, NeurIPS 2007*
- [6] FastMMD: Ensemble of Circular Discrepancy for Efficient Two-Sample Test, *Zhao and Meng, Neural Computation 2015*