2018

VAE

Neural Discrete Representation Learning

Van den Oord et al., in NeurIPS 2017

In this work, the authors propose VQ-VAE, a variant of the Variational Autoencoder (VAE) framework with a discrete latent space, using ideas from vector quantization. The two main motivations are (i) discrete variables are potentially better fit to capture the structure of data such as text and (ii) to prevent the posterior collapse in VAEs that leads to latent variables being ignored when the decoder is too powerful.

adversarial

Domain Adversarial Training of Neural Networks

Ganin et al., in JMLR 2016

In this article, the authors tackle the problem of unsupervised domain adaptation: Given labeled samples from a source distribution `\mathcal D_S` and unlabeled samples from target distribution `\mathcal D_T`, the goal is to learn a function that solves the task for both the source and target domains. In particular, the proposed model is trained on both source and target data jointly, and aims to directly learn an aligned representation of the domains, while retaining meaningful information with respect to the source labels.

adversarial examples

Do Deep Generative Models Know what they don’t Know ?

Nalisnick et al., in ICLR 2019

CNNs' prediction landscapes are known to be very sensitive to adversarial examples, which are small perturbations of an image, indistinguishable to the human eye, that lead to wrong predictions with high confidence. On the other hand, probabilistic generative models such as PixelCNNs and VAEs are trained to model a distribution over the input images, thus could be used to detect out-of-distribution inputs by estimating their likelihood under the data distribution. This paper provides interesting results showing that distributions learned by generative models are not robust enough yet for such purposes.

Excessive Invariance Causes Adversarial Vulnerability

Jacobsen et al., in ICLR 2019

The authors introduce the notion of invariance-based adversarial examples, which can be seen as a generalization of adversarial examples at the feature level: Given an image `x` and a pretrained classifier, it is possible to generate an image that is both semantically plausible and distinct from `x`, and yet yields the exact same output logits under the classifier. The authors study this scenario in the context of invertible networks, e.g., i-RevNet [1].

architectures

A simple Neural Network Module for Relational Reasoning

Santoro et al., in NeurIPS 2017

The authors propose a relation module to equip CNN architectures with notion of relational reasoning, particularly useful for tasks such as visual question answering, dynamics understanding etc.

The Reversible Residual Network: Backpropagation Without Storing Activations

Gomez et al., in NeurIPS 2017

Residual Networks (ResNet) [3] have greatly advanced the state-of-the-art in Deep Learning by making it possible to train much deeper networks via the addition of skip connections. However, in order to compute gradients during the backpropagation pass, all the units' activations have to be stored during the feed-forward pass, leading to high memory requirements for these very deep networks. Instead, the authors propose a reversible architecture in which activations at one layer can be computed from the ones of the next. Leveraging this invertibility property, they design a more efficient implementation of backpropagation, effectively trading compute power for memory storage.

domain adaptation

Domain Adversarial Training of Neural Networks

Ganin et al., in JMLR 2016

In this article, the authors tackle the problem of unsupervised domain adaptation: Given labeled samples from a source distribution `\mathcal D_S` and unlabeled samples from target distribution `\mathcal D_T`, the goal is to learn a function that solves the task for both the source and target domains. In particular, the proposed model is trained on both source and target data jointly, and aims to directly learn an aligned representation of the domains, while retaining meaningful information with respect to the source labels.

Domain Generalization with Adversarial Feature Learning

Li et al., in CVPR 2018

In this paper, the authors tackle the problem of Domain Generalization: Given multiple source domains, the goal is to learn a joint aligned feature representation, hoping it would generalize to a new unseen target domain. This is closely related to the Domain Adaptation task, with the difference that no target data (even unlabeled) is available at training time. Most approaches rely on the idea of aligning the source domains distributions in a shared space. In this work, the authors propose to additionally match the source distributions to a known prior distribution.

domain generalization

Automatically Composing Representation Transformations as a Mean for Generalization

Chang et al., in ICLR 2019

The authors focus on solving recursive tasks which can be decomposed into a sequence of simpler algorithmic procedures (e.g., arithmetic problems, geometric transformations). The main difficulties of this approach are (i) how to actually decompose the task into simpler blocks and (ii) how to extrapolate to more complex problems from learning on simpler individual tasks. The authors propose the compositional recursive learner (CRL) to learn at the same time both the structure of the task and its components.

Domain Generalization with Adversarial Feature Learning

Li et al., in CVPR 2018

In this paper, the authors tackle the problem of Domain Generalization: Given multiple source domains, the goal is to learn a joint aligned feature representation, hoping it would generalize to a new unseen target domain. This is closely related to the Domain Adaptation task, with the difference that no target data (even unlabeled) is available at training time. Most approaches rely on the idea of aligning the source domains distributions in a shared space. In this work, the authors propose to additionally match the source distributions to a known prior distribution.

few-shot learning

From Red Wine to Red Tomato: Composition with Context

Misra et al., in CVPR 2017

In this paper, the authors tackle the problem of learning classifiers of visual concepts that can also adapt to new concept compositions at test time. The main difficulty is that visual concept can different depending on the context, i.e., depending on the concepts they are combined with. For instance the red in "red tomato" is different from the one in "red wine". This work emphasizes the notion of visual concepts as composition units, rather than the usual paradigm of directly learning from large exhaustive datasets.

LaSO: Label-Set Operations Networks for Multi-label Few-Shot Learning

Alfassy et al., in CVPR 2019

In this paper, the authors tackle the problem of "multi-label few-shot learning", in which a multi-label classifier is trained with few samples of each object category, and is applied on images that contain potentially new combinations of the categories of interest. The key idea of the paper is to synthesize new samples at the feature-level by mirroring set operations (union, intersection, subtraction), hoping to the train/test distribution shift and improve the model's generalization abilities.

generative models

A Style-Based Generator Architecture for Generative Adversarial Networks

Karras et al., in CVPR 2019

In this work, the authors propose VQ-VAE, a variant of the Variational Autoencoder (VAE) framework with a discrete latent space, using ideas from vector quantization. The two main motivations are (i) discrete variables are potentially better fit to capture the structure of data such as text and (ii) to prevent the posterior collapse in VAEs that leads to latent variables being ignored when the decoder is too powerful.

Deep Image Prior

Ulyanov et al., in CVPR 2018

Deep Neural Networks are widely used in image generation tasks for capturing a general prior on natural images from a large set of observations. However, this paper shows that the structure of the network itself is able to capture a good prior, at least for local cues of image statistics. More precisely, a randomly initialized convolutional neural network can be a good handcrafted prior for low-level tasks such as denoising, inpainting.

Glow: Generative Flow with Invertible 1×1 Convolutions

D. Kingma and P. Dhariwal, in NeurIPS 2018

Invertible flow based generative models such as [2, 3] have several advantages including exact likelihood inference process (unlike VAEs or GANs) and easily parallelizable training and inference (unlike the sequential generative process in auto-regressive models). This paper proposes a new, more flexible, form of invertible flow for generative models, which builds on [3].

Do Deep Generative Models Know what they don’t Know ?

Nalisnick et al., in ICLR 2019

CNNs' prediction landscapes are known to be very sensitive to adversarial examples, which are small perturbations of an image, indistinguishable to the human eye, that lead to wrong predictions with high confidence. On the other hand, probabilistic generative models such as PixelCNNs and VAEs are trained to model a distribution over the input images, thus could be used to detect out-of-distribution inputs by estimating their likelihood under the data distribution. This paper provides interesting results showing that distributions learned by generative models are not robust enough yet for such purposes.

Neural Discrete Representation Learning

Van den Oord et al., in NeurIPS 2017

In this work, the authors propose VQ-VAE, a variant of the Variational Autoencoder (VAE) framework with a discrete latent space, using ideas from vector quantization. The two main motivations are (i) discrete variables are potentially better fit to capture the structure of data such as text and (ii) to prevent the posterior collapse in VAEs that leads to latent variables being ignored when the decoder is too powerful.

graphical models

Learning a SAT Solver from Single-Bit Supervision

Selsam et al., in ICLR 2019

The goal is to solve SAT problems with weak supervision: In that case, a model is trained only to predict the satisfiability of a formula in conjunctive normal form. As a byproduct, if the formula is satisfiable, an actual satisfying assignment can be worked out from the network's activations in most cases.

Conditional Neural Processes

Garnelo et al., in ICML 2018

Gaussian Processes are models that consider a family of functions (typically under a Gaussian distribution) and aim to quickly fit one of these functions at test time based on some observations. In that sense there are orthogonal to Neural Networks which instead aim to learn one function based on a large training set and hoping it generalizes well on any new unseen test input. This work is an attempt at bridging both approaches.

icml

Conditional Neural Processes

Garnelo et al., in ICML 2018

Gaussian Processes are models that consider a family of functions (typically under a Gaussian distribution) and aim to quickly fit one of these functions at test time based on some observations. In that sense there are orthogonal to Neural Networks which instead aim to learn one function based on a large training set and hoping it generalizes well on any new unseen test input. This work is an attempt at bridging both approaches.

image analysis

Deep Image Prior

Ulyanov et al., in CVPR 2018

Deep Neural Networks are widely used in image generation tasks for capturing a general prior on natural images from a large set of observations. However, this paper shows that the structure of the network itself is able to capture a good prior, at least for local cues of image statistics. More precisely, a randomly initialized convolutional neural network can be a good handcrafted prior for low-level tasks such as denoising, inpainting.

image compression

Neural Discrete Representation Learning

Van den Oord et al., in NeurIPS 2017

In this work, the authors propose VQ-VAE, a variant of the Variational Autoencoder (VAE) framework with a discrete latent space, using ideas from vector quantization. The two main motivations are (i) discrete variables are potentially better fit to capture the structure of data such as text and (ii) to prevent the posterior collapse in VAEs that leads to latent variables being ignored when the decoder is too powerful.

logics

Learning a SAT Solver from Single-Bit Supervision

Selsam et al., in ICLR 2019

The goal is to solve SAT problems with weak supervision: In that case, a model is trained only to predict the satisfiability of a formula in conjunctive normal form. As a byproduct, if the formula is satisfiable, an actual satisfying assignment can be worked out from the network's activations in most cases.

representation learning

Domain Adversarial Training of Neural Networks

Ganin et al., in JMLR 2016

In this article, the authors tackle the problem of unsupervised domain adaptation: Given labeled samples from a source distribution `\mathcal D_S` and unlabeled samples from target distribution `\mathcal D_T`, the goal is to learn a function that solves the task for both the source and target domains. In particular, the proposed model is trained on both source and target data jointly, and aims to directly learn an aligned representation of the domains, while retaining meaningful information with respect to the source labels.

The Variational Fair Autoencoder

Louizos et al., in ICLR 2016

The goal of this work is to propose a variational autoencoder based model that learns latent representations which are independent from some sensitive knowledge present in the data, while retaining enough information to solve the task at hand, e.g. classification. This independence constraint is incorporated via loss term based on Maximum Mean Discrepancy.

InfoVAE: Balancing Learning and Inference in Variational Autoencoders

Zhao et al., in AAAI 2019

Two known shortcomings of VAEs are that (i) The variational bound (ELBO) can lead to poor approximation of the true likelihood and inaccurate models and (ii) the model can ignore the learned latent representation when the decoder is too powerful. In this work, the author propose to tackle these problems by adding an explicit mutual information term to the standard VAE objective.

Gradient Reversal Against Discrimination

E. Raff and J. Sylvester, in DSAA 2018

In this work, the authors tackle the problem of learning fair representations, i.e. representations that should be insensitive to some given sensitive attribute, while retaining enough information to solve the task at hand. Given some input data `x` and attribute `a_p`, the task is to predict label `y` from `x` while making the attribute `a_p` protected, in other words, such that predictions are invariant to changes in `a_p`.

reversible networks

Glow: Generative Flow with Invertible 1×1 Convolutions

D. Kingma and P. Dhariwal, in NeurIPS 2018

Invertible flow based generative models such as [2, 3] have several advantages including exact likelihood inference process (unlike VAEs or GANs) and easily parallelizable training and inference (unlike the sequential generative process in auto-regressive models). This paper proposes a new, more flexible, form of invertible flow for generative models, which builds on [3].

The Reversible Residual Network: Backpropagation Without Storing Activations

Gomez et al., in NeurIPS 2017

Residual Networks (ResNet) [3] have greatly advanced the state-of-the-art in Deep Learning by making it possible to train much deeper networks via the addition of skip connections. However, in order to compute gradients during the backpropagation pass, all the units' activations have to be stored during the feed-forward pass, leading to high memory requirements for these very deep networks. Instead, the authors propose a reversible architecture in which activations at one layer can be computed from the ones of the next. Leveraging this invertibility property, they design a more efficient implementation of backpropagation, effectively trading compute power for memory storage.

Excessive Invariance Causes Adversarial Vulnerability

Jacobsen et al., in ICLR 2019

The authors introduce the notion of invariance-based adversarial examples, which can be seen as a generalization of adversarial examples at the feature level: Given an image `x` and a pretrained classifier, it is possible to generate an image that is both semantically plausible and distinct from `x`, and yet yields the exact same output logits under the classifier. The authors study this scenario in the context of invertible networks, e.g., i-RevNet [1].

robustness

Do Deep Generative Models Know what they don’t Know ?

Nalisnick et al., in ICLR 2019

CNNs' prediction landscapes are known to be very sensitive to adversarial examples, which are small perturbations of an image, indistinguishable to the human eye, that lead to wrong predictions with high confidence. On the other hand, probabilistic generative models such as PixelCNNs and VAEs are trained to model a distribution over the input images, thus could be used to detect out-of-distribution inputs by estimating their likelihood under the data distribution. This paper provides interesting results showing that distributions learned by generative models are not robust enough yet for such purposes.

structured learning

Learning a SAT Solver from Single-Bit Supervision

Selsam et al., in ICLR 2019

The goal is to solve SAT problems with weak supervision: In that case, a model is trained only to predict the satisfiability of a formula in conjunctive normal form. As a byproduct, if the formula is satisfiable, an actual satisfying assignment can be worked out from the network's activations in most cases.

Conditional Neural Processes

Garnelo et al., in ICML 2018

Gaussian Processes are models that consider a family of functions (typically under a Gaussian distribution) and aim to quickly fit one of these functions at test time based on some observations. In that sense there are orthogonal to Neural Networks which instead aim to learn one function based on a large training set and hoping it generalizes well on any new unseen test input. This work is an attempt at bridging both approaches.

visual reasoning

The Neuro-Symbolic Concept Learner

Mao et al., in ICLR 2019

Here the authors tackle the problem of Visual Question Answering. They propose to learn jointly from visual representations and text-level knowledge (question-answer pairs). They further make use of (i) curriculum learning and (ii) a differentiable symbolic solver for reasoning.

Deep Visual Analogy Making

Reed et al., in NeurIPS 2015

In this paper, the authors propose to learn visual analogies akin to the semantic and synctatic analogies naturally emerging in the Word2Vec embedding [1]: More specifically hey tackle the joint task of inferring a transformation from a given (source, target) pair, and applying the same relation to a new source image.

Measuring Abstract Reasoning in Neural Networks

Barrett et al., in ICML 2018

The authors introduce a new visual analogy dataset with the aim to analyze the reasoning abilities of ConvNets on higher abstract reasoning tasks such as small IQ tests.