Summary
- Builds on GAN by adding a condition input y (fed to both D and G)
- The objective function becomes $\underset {G} {min} \underset {D} {max}V(D,G) = E_{x \sim p_{data}(x)}[logD(x|y)] + E_{z \sim p_z(z)}[log(1 - D(G(z|y)))]$
- When there is no need for conditioning, the experiments show CGAN does not outperform a plain GAN
- CGAN can generate specific random outputs according to the condition input:
    - given a digit label, output handwritten images of that digit
    - given an image, output a probability distribution over tags and take the most probable ones as the image's tags, enabling automated tagging
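The minimax objective above can be estimated on a minibatch. A minimal sketch, assuming a toy logistic discriminator and random stand-ins for real and generated samples (all sizes and the `discriminator` function are illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, y, w):
    # Hypothetical logistic discriminator on the concatenated input [x; y]
    v = np.concatenate([x, y], axis=1)
    return 1.0 / (1.0 + np.exp(-(v @ w)))

# Toy batch: 4 samples of dimension 8, one-hot conditions over 3 classes
x_real = rng.normal(size=(4, 8))
y = np.eye(3)[rng.integers(0, 3, size=4)]
x_fake = rng.normal(size=(4, 8))   # stand-in for generator output G(z|y)
w = rng.normal(size=(11, 1))       # 8 data dims + 3 condition dims

# Monte-Carlo estimate of V(D,G) = E[log D(x|y)] + E[log(1 - D(G(z|y)))]
d_real = discriminator(x_real, y, w)
d_fake = discriminator(x_fake, y, w)
V = float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))
```

D is trained to increase this value while G is trained to decrease it, exactly as in the unconditional GAN, except that both networks also receive y.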
Abstract
this net can be constructed by adding condition information to both generator and discriminator
Introduction
by conditioning the model on additional information it is possible to direct the data generation process
Related Work
multi-modal learning for image labelling
problems with present work:
- hard to scale to a very large number of predicted output categories
- the input-to-output mapping is often one-to-many (e.g. many different tags can describe the same image)
solution:
- first issue: leverage additional information from other modalities
- second issue: use a conditional probabilistic generative model
Conditional Adversarial Nets
Generative Adversarial Nets
a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. Both G and D could be a non-linear mapping function, such as a multi-layer perceptron
Conditional Adversarial Nets
Generative adversarial nets can be extended to a conditional model if both the generator and discriminator are conditioned on some extra information y
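In practice the conditioning is done by concatenating y onto the input of each network. A minimal forward-pass sketch, assuming MNIST-like sizes and single-layer networks (the paper uses deeper MLPs with dropout; `Wg`, `Wd`, and the dimensions here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed sizes for an MNIST-like setup
NOISE_DIM, COND_DIM, DATA_DIM = 100, 10, 784

# Hypothetical single-layer weights for G and D
Wg = 0.01 * rng.normal(size=(NOISE_DIM + COND_DIM, DATA_DIM))
Wd = 0.01 * rng.normal(size=(DATA_DIM + COND_DIM, 1))

def G(z, y):
    # The generator is conditioned on y: its input is the concatenation [z; y]
    return np.tanh(np.concatenate([z, y], axis=1) @ Wg)

def D(x, y):
    # The discriminator sees the same condition: its input is [x; y]
    logits = np.concatenate([x, y], axis=1) @ Wd
    return 1.0 / (1.0 + np.exp(-logits))

z = rng.normal(size=(5, NOISE_DIM))
y = np.eye(COND_DIM)[rng.integers(0, COND_DIM, size=5)]
x_fake = G(z, y)       # shape (5, 784)
p_fake = D(x_fake, y)  # shape (5, 1), probability of "real"
```

Because both networks receive the same y, the discriminator can penalize samples that are realistic but inconsistent with the condition.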
objective function: $\underset {G} {min} \underset {D} {max}V(D,G) = E_{x \sim p_{data}(x)}[logD(x|y)] + E_{z \sim p_z(z)}[log(1 - D(G(z|y)))]$
Experimental Results
Unimodal
trained a conditional adversarial net on MNIST images conditioned on their class labels, encoded as one-hot vectors
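The one-hot encoding of the class labels can be done directly with an identity matrix lookup (label values here are illustrative):

```python
import numpy as np

# One-hot encode MNIST class labels: row i has a 1 at column labels[i]
labels = np.array([3, 0, 7])
y = np.eye(10)[labels]   # shape (3, 10)
```

These y vectors are what get concatenated to the noise z and the image x in the conditional model.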
the log-likelihood results are outperformed by several other approaches
Multimodal
use user-generated metadata (tags) from Flickr
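For the tagging use case described in the summary, the predicted distribution over a tag vocabulary can simply be thresholded to its most probable entries. A toy sketch (the vocabulary and probabilities are made up; the paper actually generates tag word vectors and maps them to nearby words, which this simplifies to a top-k selection):

```python
import numpy as np

# Hypothetical tag vocabulary and a generated probability distribution over it
vocab = ["dog", "beach", "sunset", "city", "food"]
probs = np.array([0.05, 0.40, 0.35, 0.15, 0.05])

# Take the top-2 most probable tags as the image's predicted tags
top = np.argsort(probs)[::-1][:2]
tags = [vocab[i] for i in top]
```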
Future Work
in the current experiments each tag is only used individually, but using multiple tags at the same time might achieve better results