
Image translation
Generative adversarial networks (GANs) are among the best-known deep learning architectures, thanks to their ability to generate realistic outputs from random noise input vectors. A GAN consists of two networks: a generator and a discriminator. The generator takes a random vector as input and produces a sample of output data. The discriminator receives both real data and the fake data created by the generator, and its job is to determine whether a given input comes from the real data or from the generator. You can visualize the scenario by imagining the discriminator as a bank trying to distinguish real currency from counterfeit, while the generator is a counterfeiter trying to pass fake currency to the bank. Both networks learn from their mistakes, and the generator eventually produces results that imitate the real data very closely.
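To make the generator/discriminator interplay concrete, here is a minimal training-loop sketch in PyTorch (an assumption on our part; any framework would do). The tiny fully connected networks, the sizes NOISE_DIM and DATA_DIM, and the train_step helper are all hypothetical choices for illustration, not the architecture of any particular GAN:

import torch
import torch.nn as nn

# Hypothetical sizes chosen purely for illustration
NOISE_DIM, DATA_DIM = 64, 784

# Generator: maps a random noise vector to a fake data sample
generator = nn.Sequential(
    nn.Linear(NOISE_DIM, 128), nn.ReLU(),
    nn.Linear(128, DATA_DIM), nn.Tanh(),
)

# Discriminator: outputs the probability that its input is real
discriminator = nn.Sequential(
    nn.Linear(DATA_DIM, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),
)

bce = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_batch):
    n = real_batch.size(0)
    real_labels, fake_labels = torch.ones(n, 1), torch.zeros(n, 1)

    # Discriminator step: label real data 1 and generated data 0
    fake_batch = generator(torch.randn(n, NOISE_DIM)).detach()
    d_loss = (bce(discriminator(real_batch), real_labels) +
              bce(discriminator(fake_batch), fake_labels))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: try to make the discriminator output 1 ("real")
    g_loss = bce(discriminator(generator(torch.randn(n, NOISE_DIM))),
                 real_labels)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()

# Random vectors standing in for a batch of real data
train_step(torch.randn(32, DATA_DIM))

Alternating the two updates is what drives the contest: as the discriminator gets better at spotting fakes, the generator is pushed to produce ever more convincing samples.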
One of the most interesting applications of GANs is image-to-image translation, which is based on the conditional GAN (we will discuss GANs in detail in Chapter 7). Given a pair of related images, say I1 and I2, a conditional GAN learns how to convert I1 into I2. A dedicated tool called pix2pix was created to demonstrate the applications of this concept. It can be used to colorize black-and-white images, create maps from satellite images, generate object images from mere sketches, and much more!
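As a sketch of the conditional setup (again assuming PyTorch), the generator receives the input image I1 instead of pure noise, and the discriminator judges (input, output) pairs; pix2pix additionally adds an L1 loss that keeps the generated image close to the target I2. The one-layer G and D below are hypothetical stand-ins for the real model's networks, kept tiny for readability:

import torch
import torch.nn as nn

# Hypothetical single-layer stand-ins for pix2pix's generator
# and discriminator, used here only to show the loss structure
G = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Tanh())
D = nn.Sequential(nn.Conv2d(6, 1, 4, stride=2, padding=1), nn.Sigmoid())

bce, l1 = nn.BCELoss(), nn.L1Loss()
g_opt = torch.optim.Adam(G.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(D.parameters(), lr=2e-4)
LAMBDA = 100.0  # weight on the L1 term, as in the pix2pix paper

def train_step(i1, i2):
    # One conditional-GAN update on a paired batch (I1 -> I2)
    fake_i2 = G(i1)

    # The discriminator is conditioned on the input image: it sees
    # (input, output) pairs concatenated along the channel axis
    real_pair = torch.cat([i1, i2], dim=1)
    fake_pair = torch.cat([i1, fake_i2.detach()], dim=1)
    d_real, d_fake = D(real_pair), D(fake_pair)
    d_loss = (bce(d_real, torch.ones_like(d_real)) +
              bce(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # The generator must fool D and stay close to the target (L1)
    d_out = D(torch.cat([i1, fake_i2], dim=1))
    g_loss = bce(d_out, torch.ones_like(d_out)) + LAMBDA * l1(fake_i2, i2)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()

# Random tensors standing in for a batch of 256x256 image pairs
train_step(torch.randn(4, 3, 256, 256), torch.randn(4, 3, 256, 256))

In the actual pix2pix model the generator is a U-Net and the discriminator is a PatchGAN that classifies local image patches, but the combination of adversarial and L1 losses is the same.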
The following is the link to the paper on image-to-image translation published by Phillip Isola et al., along with a sample image from pix2pix depicting various applications of image-to-image translation (https://arxiv.org/abs/1611.07004):
