Generative adversarial networks

Generative adversarial networks (GANs) are evolving quickly and are receiving considerable attention from the research community. Yann LeCun's remark that the GAN framework is the most interesting idea in the last 10 years of machine learning is evidence of how important the framework is perceived to be.

The following is a figure representing applications of the GAN framework:

Source: Generative Adversarial Nets (https://arxiv.org/abs/1406.2661)

The GAN framework has been widely used to generate data from many domains. Examples of data generation with GANs include text-to-image synthesis, image super-resolution, and symbolic music generation. In addition to data generation, the GAN framework has also been used for unsupervised feature learning.

GANs were first described in the landmark 2014 paper Generative Adversarial Nets by Ian Goodfellow et al. The framework uses an adversarial process to estimate the parameters of two models by iteratively and simultaneously training a discriminator network and a generator network.
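For reference, the adversarial process is usually summarized by the two-player minimax value function given in the Generative Adversarial Nets paper. The notation below follows that paper rather than this text: D(x) is the discriminator's estimate of the probability that x is real, G(z) is the generator's output for a noise sample z, p_data is the data distribution, and p_z is the noise prior.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator is trained to increase V, while the generator is trained to decrease it.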

One of the main advantages of GANs is that they do not require an approximation method. Other approaches, such as variational autoencoders (VAEs), rely on approximation methods to handle intractable functions or inference.

Informally speaking, the discriminator network plays the role of an investigator that learns to distinguish between samples that are real (samples that come from the distribution that generates the real data), and samples that are fake (samples produced by the generator). The generator network plays the role of a counterfeiter that uses feedback from the discriminator to learn how to produce samples that are capable of fooling the discriminator. This informal description already calls attention to the following potential issues with the GAN framework:

- A weak investigator might be easily fooled by the generator.
- An investigator without enough capacity might not learn to distinguish the data properly.
- An investigator that disregards variety can be fooled with a single example.
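
The investigator and counterfeiter roles can be made concrete with a small sketch of the two losses. This is a minimal illustration in plain Python, not the implementation used later in the book: the function names, the choice of binary cross-entropy with the non-saturating generator loss, and the discriminator outputs are all assumptions made for the example.

```python
import math


def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy for one probability and one 0/1 label."""
    # Clamp the probability so that log() never receives 0.
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """Investigator's loss: real samples are labelled 1, fakes are labelled 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Counterfeiter's loss (non-saturating form): fakes should be scored as real."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs, read as "probability that the sample is real".
d_real = [0.9, 0.95]          # scores given to real samples
strong_d_fake = [0.1, 0.05]   # a strong investigator rejects the fakes
fooled_d_fake = [0.9, 0.85]   # a weak investigator accepts the fakes

print("strong investigator:",
      discriminator_loss(d_real, strong_d_fake), generator_loss(strong_d_fake))
print("fooled investigator:",
      discriminator_loss(d_real, fooled_d_fake), generator_loss(fooled_d_fake))
```

When the investigator is fooled, its own loss rises while the counterfeiter's loss falls, which is the feedback the generator exploits during training.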

GANs have been successfully applied to many domains. In computer vision, notable applications of GANs include progressive growing of GANs and pix2pixHD, both of which we will learn to implement in this book. The following diagram illustrates our description of the GAN framework and is the simplest basis for a GAN implementation:

The early paper Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks by Alec Radford et al. describes many exciting applications of GANs. In addition to using a trained discriminator for image classification tasks, where it shows competitive performance with other unsupervised algorithms, the authors show that the generator has interesting vector arithmetic properties that allow for the manipulation of semantic qualities. The following figure from the paper shows an application of GAN vector arithmetic:

Source: Unsupervised Representation Learning with Deep Convolutional GANs (https://arxiv.org/abs/1511.06434)
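
The vector arithmetic itself happens in the generator's input (latent) space. The sketch below shows only that arithmetic, using the smiling woman, neutral woman, and neutral man example from the paper. The 4-dimensional latent codes and their values are toy numbers invented for illustration, the helper names are hypothetical, and no trained generator is included, so the result is a latent vector rather than an image.

```python
def mean_vector(vectors):
    """Average a list of equally sized latent vectors, component by component."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]


def subtract(a, b):
    """Element-wise a - b."""
    return [x - y for x, y in zip(a, b)]


def add(a, b):
    """Element-wise a + b."""
    return [x + y for x, y in zip(a, b)]


# Toy latent codes (assumed values). Each concept is represented by three
# exemplars, which are averaged before doing the arithmetic.
smiling_woman = [[1.0, 2.0, 0.0, 1.0], [1.0, 2.0, 0.0, 3.0], [1.0, 2.0, 0.0, 2.0]]
neutral_woman = [[1.0, 0.0, 0.0, 2.0], [1.0, 0.0, 0.0, 2.0], [1.0, 0.0, 0.0, 2.0]]
neutral_man = [[-1.0, 0.0, 0.0, 2.0], [-1.0, 0.0, 0.0, 2.0], [-1.0, 0.0, 0.0, 2.0]]

# "smiling woman" - "neutral woman" isolates a smile direction in latent space;
# adding "neutral man" moves that direction onto a different subject.
smile_direction = subtract(mean_vector(smiling_woman), mean_vector(neutral_woman))
smiling_man_z = add(smile_direction, mean_vector(neutral_man))

print(smiling_man_z)
```

In practice, the resulting vector would be passed through a trained generator to obtain the corresponding image.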

Despite the framework's recency, many variations of it have been proposed, including energy-based GANs, boundary equilibrium GANs, mix-GANs, least-squares GANs, Wasserstein GANs, Wasserstein GANs with gradient penalty, and relativistic GANs.

Interestingly, in a recent paper called Are GANs Created Equal? A Large-Scale Study, researchers from Google Brain performed experiments with multiple GAN variations and VAEs, not including relativistic GANs or reversible flow models, and reported finding no evidence that the variations were superior to the first GAN formulation proposed in Generative Adversarial Nets. This result suggests that GAN research should be based on systematic and objective evaluation.

Progressive growing of GANs has been the state of the art in GAN-based image generation, and it allows the generation of high-resolution images of 1,024 x 1,024 pixels. In the following figure, we provide a selection of celebrity faces generated with progressive growing of GANs:

Source: Progressive Growing of GANs (https://arxiv.org/abs/1710.10196)

Another impressive development in GANs is pix2pixHD. This is a method for high-resolution (for example, 2,048 x 1,024) photorealistic image-to-image translation. The method has been used to synthesize portraits from face label maps and to turn semantic label maps into photorealistic images. The following is an example of an input label map:

Source: pix2pixHD project page (https://tcwang0509.github.io/pix2pixHD/)

The following is the corresponding synthesized figure:

Source: pix2pixHD project page (https://tcwang0509.github.io/pix2pixHD/)