Summary
- Builds on GAN by adding a condition input y (fed to both D and G)
- The objective function becomes $\underset {G} {min} \underset {D} {max}V(D,G) = E_{x \sim p_{data}(x)}[logD(x|y)] + E_{z \sim p_z(z)}[log(1 - D(G(z|y)))]$
- When there is no need for conditioning, the experiments show CGAN does not outperform a plain GAN
- CGAN can generate specific random outputs according to the condition input:
    - given a digit label, output handwritten images of that digit
    - given an image, output a probability distribution over tags and take the most probable ones as the image's tags, enabling automated tagging
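The minimax objective above can be estimated on a minibatch. A minimal sketch, assuming a toy logistic discriminator and random stand-ins for real and generated samples (all sizes and the `discriminator` function are illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, y, w):
    # Hypothetical logistic discriminator on the concatenated input [x; y]
    v = np.concatenate([x, y], axis=1)
    return 1.0 / (1.0 + np.exp(-(v @ w)))

# Toy batch: 4 samples of dimension 8, one-hot conditions over 3 classes
x_real = rng.normal(size=(4, 8))
y = np.eye(3)[rng.integers(0, 3, size=4)]
x_fake = rng.normal(size=(4, 8))   # stand-in for generator output G(z|y)
w = rng.normal(size=(11, 1))       # 8 data dims + 3 condition dims

# Monte-Carlo estimate of V(D,G) = E[log D(x|y)] + E[log(1 - D(G(z|y)))]
d_real = discriminator(x_real, y, w)
d_fake = discriminator(x_fake, y, w)
V = float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))
```

D is trained to increase this value while G is trained to decrease it, exactly as in the unconditional GAN, except that both networks also receive y.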
Abstract
this net can be constructed by adding condition information to both generator and discriminator
Introduction
by conditioning the model on additional information it is possible to direct the data generation process
Related Work
multi-modal learning for image labelling
problems with present work:
- hard to scale to a very large number of predicted output categories
- the input-to-output mapping is often one-to-many (e.g. many different tags can describe the same image)
solution:
- first issue: leverage additional information from other modalities
- second issue: use a conditional probabilistic generative model
Conditional Adversarial Nets
Generative Adversarial Nets
a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. Both G and D could be a non-linear mapping function, such as a multi-layer perceptron
Conditional Adversarial Nets
Generative adversarial nets can be extended to a conditional model if both the generator and discriminator are conditioned on some extra information y
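In practice the conditioning is done by concatenating y onto the input of each network. A minimal forward-pass sketch, assuming MNIST-like sizes and single-layer networks (the paper uses deeper MLPs with dropout; `Wg`, `Wd`, and the dimensions here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed sizes for an MNIST-like setup
NOISE_DIM, COND_DIM, DATA_DIM = 100, 10, 784

# Hypothetical single-layer weights for G and D
Wg = 0.01 * rng.normal(size=(NOISE_DIM + COND_DIM, DATA_DIM))
Wd = 0.01 * rng.normal(size=(DATA_DIM + COND_DIM, 1))

def G(z, y):
    # The generator is conditioned on y: its input is the concatenation [z; y]
    return np.tanh(np.concatenate([z, y], axis=1) @ Wg)

def D(x, y):
    # The discriminator sees the same condition: its input is [x; y]
    logits = np.concatenate([x, y], axis=1) @ Wd
    return 1.0 / (1.0 + np.exp(-logits))

z = rng.normal(size=(5, NOISE_DIM))
y = np.eye(COND_DIM)[rng.integers(0, COND_DIM, size=5)]
x_fake = G(z, y)       # shape (5, 784)
p_fake = D(x_fake, y)  # shape (5, 1), probability of "real"
```

Because both networks receive the same y, the discriminator can penalize samples that are realistic but inconsistent with the condition.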
objective function: $\underset {G} {min} \underset {D} {max}V(D,G) = E_{x \sim p_{data}(x)}[logD(x|y)] + E_{z \sim p_z(z)}[log(1 - D(G(z|y)))]$
Experimental Results
Unimodal
trained a conditional adversarial net on MNIST images conditioned on their class labels, encoded as one-hot vectors
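The one-hot encoding of the class labels can be done directly with an identity matrix lookup (label values here are illustrative):

```python
import numpy as np

# One-hot encode MNIST class labels: row i has a 1 at column labels[i]
labels = np.array([3, 0, 7])
y = np.eye(10)[labels]   # shape (3, 10)
```

These y vectors are what get concatenated to the noise z and the image x in the conditional model.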
the log-likelihood results are outperformed by several other approaches
Multimodal
use user-generated metadata (tags) from Flickr
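For the tagging use case described in the summary, the predicted distribution over a tag vocabulary can simply be thresholded to its most probable entries. A toy sketch (the vocabulary and probabilities are made up; the paper actually generates tag word vectors and maps them to nearby words, which this simplifies to a top-k selection):

```python
import numpy as np

# Hypothetical tag vocabulary and a generated probability distribution over it
vocab = ["dog", "beach", "sunset", "city", "food"]
probs = np.array([0.05, 0.40, 0.35, 0.15, 0.05])

# Take the top-2 most probable tags as the image's predicted tags
top = np.argsort(probs)[::-1][:2]
tags = [vocab[i] for i in top]
```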
Future Work
in the current experiments each tag is only used individually, but using multiple tags at the same time might achieve better results