GANs • DCGAN • WGAN • CycleGAN • Image-to-Image Translation • Text-to-Image Models
Generative Models are deep learning models that can create new data.
They do not just classify or predict; they generate new images, voices, text, and even video.
These models learn the patterns of a dataset and then use those patterns to produce similar, realistic content.
This chapter covers GANs, DCGAN, WGAN, CycleGAN, Image-to-Image translation, and Text-to-Image models, all explained in clear and simple language.
1. Generative Adversarial Networks (GANs)
GANs are one of the most exciting inventions in deep learning.
They work using two neural networks that compete with each other:
- Generator → creates new images
- Discriminator → checks if images are real or fake
Simple Explanation
Imagine an art student (Generator) trying to paint fake currency notes, and a bank officer (Discriminator) trying to detect if a note is real or fake.
Both improve over time:
- The student learns to paint better fakes
- The officer learns to detect better
This competition helps GANs create very realistic images.
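The competition can be made concrete with a small worked example of the losses each network tries to minimize. This is a minimal sketch in plain Python, assuming the common binary cross-entropy formulation and the "non-saturating" generator loss; the function names and the sample scores are illustrative, not taken from any library.

```python
import math

def binary_cross_entropy(prediction, label, eps=1e-7):
    """Loss for one discriminator output in (0, 1) against a 0/1 label."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

def mean(values):
    return sum(values) / len(values)

def discriminator_loss(real_scores, fake_scores):
    # The "officer" wants real notes scored near 1 and fakes near 0.
    real_term = mean([binary_cross_entropy(s, 1.0) for s in real_scores])
    fake_term = mean([binary_cross_entropy(s, 0.0) for s in fake_scores])
    return real_term + fake_term

def generator_loss(fake_scores):
    # The "student" wants its fakes scored near 1 (mistaken for real).
    return mean([binary_cross_entropy(s, 1.0) for s in fake_scores])

# Hypothetical discriminator outputs for generated images.
early_fakes = [0.1, 0.2]   # early in training: fakes are easy to spot
later_fakes = [0.6, 0.7]   # later: fakes fool the discriminator more often
print(generator_loss(early_fakes) > generator_loss(later_fakes))  # True
```

As the generator's fakes earn higher scores, its loss falls, while the discriminator's loss rises unless it improves too.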
What GANs Can Do
- Create human faces that do not exist
- Generate cartoon characters
- Produce artwork
- Create new fashion designs
- Enhance old photos
Basic GAN Code Example (Keras)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Generator: maps a 100-dimensional noise vector to 784 values in [0, 1]
# (for example, a flattened 28x28 grayscale image)
generator = Sequential([
    Dense(128, activation='relu', input_shape=(100,)),
    Dense(784, activation='sigmoid')
])

# Discriminator: maps a flattened image to the probability that it is real
discriminator = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(1, activation='sigmoid')
])

2. DCGAN (Deep Convolutional GAN)
DCGANs build both networks from convolutional layers, which suits them to image data better than a GAN made only of dense layers.
Why DCGANs Are Better
- Generate sharper images
- Learn shapes and textures more clearly
- Use CNNs instead of simple dense layers
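To see how convolutional layers shape an image, it helps to follow the spatial sizes through the network. DCGAN generators typically upsample with strided transposed convolutions, and discriminators downsample with strided convolutions. The sketch below only does the size arithmetic, assuming 'same' padding (the convention used by Keras convolution layers), a stride of 2, and a 7 → 28 pixel progression chosen for illustration.

```python
import math

def conv_output_size(size, stride):
    """Spatial size after a strided convolution with 'same' padding."""
    return math.ceil(size / stride)

def conv_transpose_output_size(size, stride):
    """Spatial size after a transposed convolution with 'same' padding."""
    return size * stride

# Generator: upsample a small feature map to a 28x28 image.
generator_sizes = [7]
for _ in range(2):
    generator_sizes.append(conv_transpose_output_size(generator_sizes[-1], stride=2))

# Discriminator: downsample the 28x28 image back to a small feature map.
discriminator_sizes = [28]
for _ in range(2):
    discriminator_sizes.append(conv_output_size(discriminator_sizes[-1], stride=2))

print(generator_sizes)      # [7, 14, 28]
print(discriminator_sizes)  # [28, 14, 7]
```

Each stride-2 layer doubles (generator) or halves (discriminator) the width and height, so two layers are enough to move between 7x7 and 28x28.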
Real Example
DCGANs can create:
- Anime characters
- Realistic faces
- New fashion designs
- Rooms, buildings, and landscapes
3. WGAN (Wasserstein GAN)
WGANs address many of the training-stability problems of standard GANs.
Simple Explanation
Regular GANs often:
- Suffer mode collapse (produce the same image again and again)
- Become unstable
- Stop learning
WGANs train with the Wasserstein (earth mover's) distance instead of the standard GAN loss, which:
- Stabilizes training
- Creates more diverse images
- Reduces mode collapse
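The change in loss can be sketched in a few lines. In a WGAN the discriminator is usually called a critic: it outputs an unbounded score rather than a probability. The example below is a minimal illustration in plain Python, assuming the weight-clipping variant with a clip value of 0.01; the function names and sample scores are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def critic_loss(real_scores, fake_scores):
    # The critic is trained to score real images higher than generated ones,
    # so minimizing this value widens the gap between the two averages.
    return mean(fake_scores) - mean(real_scores)

def wgan_generator_loss(fake_scores):
    # The generator is trained to raise the critic's score on its images.
    return -mean(fake_scores)

def clip_weights(weights, clip_value=0.01):
    # Weight clipping keeps every critic weight inside [-clip_value, clip_value].
    return [max(-clip_value, min(clip_value, w)) for w in weights]

print(critic_loss([2.0, 3.0], [-1.0, 0.0]))   # -3.0
print(clip_weights([0.5, -0.2, 0.005]))       # [0.01, -0.01, 0.005]
```

Because the scores are not squashed through a sigmoid, the loss keeps giving a useful training signal even when the critic separates real and generated images easily.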
Real Example
WGANs are used in:
- Medical imaging
- Scientific simulations
- High-resolution image generation
4. CycleGAN
CycleGANs are used for image-to-image translation without needing paired images.
What Does Image-to-Image Translation Mean?
It means converting an image from one visual domain to another while preserving its underlying structure (shapes and layout).
Examples of CycleGAN Magic
- Turning horses → zebras
- Changing summer → winter
- Converting photos → paintings
- Making day → night scenes
The key trick is "cycle consistency," meaning:
- A → B → A should be similar to the original image A
This helps the model learn transformations accurately.
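Cycle consistency can be written as a loss: translate A → B → A and measure how far the result is from the original. The sketch below assumes an L1 (mean absolute difference) distance and uses simple toy functions in place of the two generator networks; all names are illustrative.

```python
def l1_distance(image_x, image_y):
    """Mean absolute difference between two images given as flat pixel lists."""
    return sum(abs(x - y) for x, y in zip(image_x, image_y)) / len(image_x)

def cycle_consistency_loss(image_a, a_to_b, b_to_a):
    """Translate A -> B -> A and measure the distance to the original."""
    reconstructed_a = b_to_a(a_to_b(image_a))
    return l1_distance(image_a, reconstructed_a)

# Toy stand-ins for the two generators (real ones are neural networks).
def darken(image):        # plays the role of A -> B
    return [0.5 * p for p in image]

def brighten(image):      # plays the role of B -> A, an exact inverse of darken
    return [2.0 * p for p in image]

def identity(image):      # a poor B -> A mapping that does not undo darken
    return list(image)

image_a = [0.2, 0.4, 0.8]
print(cycle_consistency_loss(image_a, darken, brighten))      # 0.0
print(cycle_consistency_loss(image_a, darken, identity) > 0)  # True
```

A pair of mappings that undo each other gives zero loss, so minimizing this term pushes the two generators toward being inverses, which is what lets training work without paired images.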
5. Image-to-Image Translation
This technique converts one form of image into another.
GANs and specialized models are used to transform images.
Examples
- Sketch → Photo
- Black-and-white → Color
- Low-resolution → High-resolution
- Satellite → Map view
- CT scan → MRI scan
These models are used in:
- Art
- Medicine
- Gaming
- Fashion
- Cartoons
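When paired examples are available (for instance, a sketch and its matching photo), one common training objective, used by Pix2Pix, adds a pixel-wise L1 term to the adversarial loss so the output stays close to the known target. The sketch below is a simplified single-image version in plain Python; the function name and sample pixels are hypothetical, and the weight of 100 follows the Pix2Pix default but should be treated as a tunable assumption.

```python
import math

def paired_generator_loss(fake_score, generated, target, l1_weight=100.0):
    """Adversarial term plus a weighted pixel-wise L1 term for paired translation."""
    # Adversarial term: low when the discriminator scores the output near 1 ("real").
    adversarial = -math.log(max(fake_score, 1e-7))
    # L1 term: low when the output is close to the paired ground-truth image.
    l1 = sum(abs(g - t) for g, t in zip(generated, target)) / len(target)
    return adversarial + l1_weight * l1

target = [0.0, 0.5, 1.0]            # e.g. pixels of the true color photo
close_output = [0.0, 0.5, 0.9]      # generator output near the target
far_output = [0.5, 0.5, 0.5]        # generator output far from the target
print(paired_generator_loss(0.5, close_output, target)
      < paired_generator_loss(0.5, far_output, target))  # True
```

With the same discriminator score, the output that is closer to the target receives the lower loss.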
Code Example: Using Pix2Pix (TensorFlow)
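import tensorflow as tf
# Requires the tensorflow_examples package:
#   pip install git+https://github.com/tensorflow/examples.git
from tensorflow_examples.models.pix2pix import pix2pix

# U-Net generator that outputs 3-channel (RGB) images
generator = pix2pix.unet_generator(3, norm_type='batchnorm')
# PatchGAN discriminator; by default it takes the input and target images as a pair
discriminator = pix2pix.discriminator(norm_type='batchnorm')

6. Text-to-Image Models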
Text-to-image models generate pictures from written descriptions.
Examples
- "A cat wearing sunglasses"
- "A castle on the moon"
- "A pineapple wearing headphones"
These models encode the text into a numerical representation of its meaning and then generate an image conditioned on that representation.
Famous Text-to-Image Models
- DALL·E
- Stable Diffusion
- Midjourney
- Imagen (Google)
How They Work
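- Convert words into vectors
- Model the relationships between the words (for example, with a transformer text encoder)
- Generate an image conditioned on those vectors, using a decoder network or, in most recent models, a diffusion process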
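The first step, turning words into vectors, can be illustrated with a toy example. This sketch is not how production models work: it assigns each word a fixed random vector and averages them, whereas real systems use learned text encoders that also account for word order. The vocabulary, vector size, and function names are all hypothetical.

```python
import random

def build_toy_embeddings(vocabulary, dim=4, seed=0):
    """Assign each word a fixed random vector (real models learn these)."""
    rng = random.Random(seed)
    return {word: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for word in vocabulary}

def encode_prompt(prompt, embeddings, dim=4):
    """Look up a vector per known word, then average into one conditioning vector."""
    tokens = prompt.lower().split()
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim  # no known words in the prompt
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

vocabulary = ["a", "cat", "wearing", "sunglasses", "castle", "on", "the", "moon"]
embeddings = build_toy_embeddings(vocabulary)
cat_vector = encode_prompt("A cat wearing sunglasses", embeddings)
moon_vector = encode_prompt("A castle on the moon", embeddings)
print(len(cat_vector))            # 4
print(cat_vector != moon_vector)  # True
```

Different prompts produce different conditioning vectors, and the image generator uses that vector to decide what to draw.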
Real-World Applications
- Game design
- Marketing graphics
- Movie storyboarding
- AI art creation
- Product design
7. Why Generative Models Matter
Generative models are transforming many fields:
| Field | Usage |
|---|---|
| Art & Creativity | AI-generated paintings, music, designs |
| Medicine | Creating synthetic medical images |
| Education | Creating training data |
| Gaming | Auto-generated characters, textures |
| Security | Deepfake detection research |
| Film | CGI character generation |
They can synthesize an effectively unlimited supply of new samples, which can be used to augment training data for other models and to support creative work.
8. Simple GAN Training Loop Example (Pseudo-code)
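for each epoch:
    # Train Discriminator
    real_images = get_real_images()
    noise = sample_random_noise()    # hypothetical helper: draw random input vectors
    fake_images = generator(noise)
    discriminator.train(real_images, labels_real)    # real images labeled 1
    discriminator.train(fake_images, labels_fake)    # generated images labeled 0
    # Train Generator (discriminator weights stay fixed during this step)
    generator.train(noise, labels_real)    # try to fool discriminator

This simple loop shows how the two networks compete.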
9. Recap Table
| Model | Purpose | Example Output |
|---|---|---|
| GAN | Generate new data | Fake human faces |
| DCGAN | Generate images with convolutional layers | Anime characters |
| WGAN | Stable training | High-quality images |
| CycleGAN | Translate between styles | Horse ↔ Zebra |
| Text-to-Image | Create images from text | "A robot eating pizza" |