Normalization • Resize/Crop • Flipping • Rotation • Label Encoding
Before training a deep learning model, our data must be clean, organized, and ready for learning.
Just like students learn better when notes are clear and well-arranged, neural networks also learn better when data is properly prepared.
This chapter explains four essential preprocessing and augmentation techniques: Normalization, Resize/Crop, Flipping & Rotation, and Label Encoding.
These methods help models learn faster, become more accurate, and avoid confusion during training.
1. Normalization
Normalization changes the scale of data so that all values fall within a similar range (like 0–1).
Neural networks learn better when numbers are balanced.
Simple Explanation
Imagine you have two features:
- Height in centimeters (150–180)
- Age (10–18)
Height values are much larger than age, so the model may mistakenly think height is more important.
Normalization fixes this by putting both on a similar scale.
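Code Example (Min-Max Scaling)
A minimal sketch with made-up height and age values; min-max scaling maps each feature (column) to the 0–1 range:
import numpy as np
# Hypothetical samples: [height in cm, age in years]
data = np.array([[150.0, 10.0],
                 [165.0, 14.0],
                 [180.0, 18.0]])
# Min-max scaling per feature: (x - min) / (max - min)
minimum = data.min(axis=0)
maximum = data.max(axis=0)
normalized = (data - minimum) / (maximum - minimum)
print(normalized)   # every column now lies between 0 and 1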
Why Normalization Helps
- Speeds up training, because gradient descent converges faster when features share a similar scale
- Keeps training numerically stable
- Helps the model weight all features fairly
Image Normalization
Pixel values typically range from 0 to 255.
Normalization scales them to the 0–1 range, usually by dividing by 255.
Code Example
import numpy as np
# Random 28x28 grayscale image with pixel values 0-255
image = np.random.randint(0, 256, (28, 28))
# Scale pixel values to the 0-1 range
normalized_image = image / 255.0
print(normalized_image[:2])

2. Resize and Crop
Images come in different shapes and sizes.
Most neural networks require all input images to be the same size to train properly.
Resize
Adjusts the entire image to the required size, such as 224×224 pixels.
Crop
Removes unnecessary parts of the image and keeps the main area.
Why These Techniques Are Useful
- Makes training stable
- Reduces memory usage
- Improves accuracy
Real Example
A dataset of cat images may contain:
- Big close-up faces
- Full-body photos
- Side angles
Resizing and cropping make all images uniform.
Code Example (TensorFlow)
import tensorflow as tf
image = tf.random.uniform((300, 300, 3))      # random 300x300 RGB image
resized = tf.image.resize(image, (224, 224))  # resize the whole image to 224x224
cropped = tf.image.central_crop(image, 0.7)   # keep the central 70% of the image
print(resized.shape)   # (224, 224, 3)
print(cropped.shape)   # (210, 210, 3)

3. Flipping and Rotation (Data Augmentation)
Data Augmentation means creating new training samples by slightly changing original images.
This helps the model generalize instead of memorizing the exact training images.
Flipping
- Horizontal flip: mirror left ↔ right
- Vertical flip: mirror up ↔ down (less common)
Example: Mirrors of cats, cars, or faces.
Rotation
Rotates images by small angles (e.g., 10°, 20°).
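The code example below rotates by a fixed 90°. For small random angles like these, here is a minimal sketch using TensorFlow's RandomRotation layer (assuming TensorFlow 2.x; the factor is a fraction of a full turn, so 0.055 ≈ ±20°):
import tensorflow as tf
# Rotate randomly within roughly ±20 degrees (0.055 of a full 360° turn)
rotate = tf.keras.layers.RandomRotation(0.055)
image = tf.random.uniform((200, 200, 3))
rotated = rotate(image[tf.newaxis, ...], training=True)   # add a batch dimension
print(rotated.shape)   # (1, 200, 200, 3)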
Why Augmentation Helps
- Increases the effective amount of training data
- Reduces overfitting
- Makes the model more robust to changes in orientation and viewpoint
Real Example
If you take a selfie and flip it horizontally:
- It still looks like you
- But the model sees this as a new training example
Code Example
import tensorflow as tf
image = tf.random.uniform((200, 200, 3))    # random 200x200 RGB image
flipped = tf.image.flip_left_right(image)   # mirror left <-> right
rotated = tf.image.rot90(image)             # rotate 90 degrees counter-clockwise
print(flipped.shape, rotated.shape)

4. Label Encoding
Neural networks cannot understand text labels like:
- "Cat"
- "Dog"
- "Horse"
Label Encoding converts text labels into numbers.
Example
| Animal | Encoded Label |
|---|---|
| Cat | 0 |
| Dog | 1 |
| Horse | 2 |
The model learns patterns from these numbers.
When One-Hot Encoding Is Needed
Integer labels suggest an order (0 < 1 < 2) between classes that does not really exist. For multi-class problems, One-Hot Encoding avoids this by giving each class its own column:
| Cat | Dog | Horse |
|---|---|---|
| 1 | 0 | 0 |
| 0 | 1 | 0 |
| 0 | 0 | 1 |
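Code Example (One-Hot Encoding)
As a minimal sketch, the table above can be produced from integer labels with tf.keras.utils.to_categorical (assuming TensorFlow is available; cat=0, dog=1, horse=2 are the illustrative labels from above):
import tensorflow as tf
# Integer labels from label encoding: cat=0, dog=1, horse=2
encoded_labels = [0, 1, 2]
one_hot = tf.keras.utils.to_categorical(encoded_labels, num_classes=3)
print(one_hot)
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]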
Code Example (Label Encoding)
from sklearn.preprocessing import LabelEncoder
labels = ["cat", "dog", "cat", "horse"]
encoder = LabelEncoder()
# Classes are numbered in alphabetical order: cat=0, dog=1, horse=2
encoded_labels = encoder.fit_transform(labels)
print(encoded_labels)   # [0 1 0 2]

5. Complete Preprocessing Pipeline Example
This example shows how to combine preprocessing techniques for images.
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1./255,           # normalization: scale pixels to 0-1
    rotation_range=20,        # rotation: up to ±20 degrees
    horizontal_flip=True,     # flipping
    width_shift_range=0.1,    # small horizontal shifts
    height_shift_range=0.1    # small vertical shifts
)

train = datagen.flow_from_directory(
    "dataset/train",
    target_size=(224, 224),   # resizing
    batch_size=32,
    class_mode="categorical"  # one-hot encoded labels
)

Explanation
- rescale=1./255 → Normalizes pixels to 0–1
- rotation_range → Rotates images
- horizontal_flip → Flips images
- target_size → Resizes images
- class_mode="categorical" → One-hot label encoding
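ImageDataGenerator still works, but recent TensorFlow releases treat it as legacy. Here is a rough equivalent sketch built from Keras preprocessing layers and image_dataset_from_directory (assuming TensorFlow 2.x and the same hypothetical dataset/train folder, with one sub-folder per class):
import tensorflow as tf
train = tf.keras.utils.image_dataset_from_directory(
    "dataset/train",
    image_size=(224, 224),        # resizing
    batch_size=32,
    label_mode="categorical"      # one-hot label encoding
)
augment = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1./255),            # normalization
    tf.keras.layers.RandomFlip("horizontal"),     # flipping
    tf.keras.layers.RandomRotation(0.055),        # rotation, roughly ±20°
    tf.keras.layers.RandomTranslation(0.1, 0.1),  # small shifts
])
# Apply the augmentation to every batch of images
train = train.map(lambda images, labels: (augment(images, training=True), labels))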
6. Why Preprocessing & Augmentation Matter
These techniques:
- Improve model accuracy
- Reduce overfitting
- Allow the model to learn from fewer images
- Prepare messy real-world data for training
- Make the network more reliable
Without proper preprocessing, even a strong neural network will often perform poorly.