🌸 EfficientNet-B4 Flower Classifier
An image classification model for identifying 102 flower species from the Oxford Flowers-102 dataset.
Model Details
Model Description
This model is built on the EfficientNet-B4 backbone with a custom classifier head, trained using a novel 6-Phase Progressive Training strategy. The training progressively increases image resolution (280px → 400px) and augmentation difficulty (None → MixUp → CutMix → Hybrid).
- Developed by: fth2745
- Model type: Image Classification (CNN)
- License: MIT
- Finetuned from: EfficientNet-B4 (ImageNet pretrained)
Performance
| Metric | Test Set | Validation Set |
|---|---|---|
| Top-1 Accuracy | 94.49% | 97.45% |
| Top-3 Accuracy | 97.61% | 98.82% |
| Top-5 Accuracy | 98.49% | 99.31% |
| Macro F1-Score | 94.75% | 97.13% |
Training Details
Training Data
Oxford Flowers-102 dataset with offline data augmentation (tier-based augmentation for class balancing).
Training Procedure
6-Phase Progressive Training
| Phase | Epochs | Resolution | Augmentation | Dropout |
|---|---|---|---|---|
| 1. Basic | 1-5 | 280×280 | Basic Preprocessing | 0.4 |
| 2. MixUp Soft | 6-10 | 320×320 | MixUp α=0.2 | 0.2 |
| 3. MixUp Hard | 11-15 | 320×320 | MixUp α=0.4 | 0.2 |
| 4. CutMix Soft | 16-20 | 380×380 | CutMix α=0.2 | 0.2 |
| 5. CutMix Hard | 21-30 | 380×380 | CutMix α=0.5 | 0.2 |
| 6. Grand Finale | 31-40 | 400×400 | Hybrid | 0.2 |
Preprocessing
- Resize → RandomCrop → HorizontalFlip → Rotation (±20°) → Affine → ColorJitter → Normalize (ImageNet)
Training Hyperparameters
- Optimizer: AdamW
- Learning Rate: 1e-3
- Weight Decay: 1e-4
- Scheduler: CosineAnnealingWarmRestarts (T_0=5, T_mult=2)
- Loss: CrossEntropyLoss (label_smoothing=0.1)
- Batch Size: 8
- Training Regime: fp16 mixed precision (AMP)
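A minimal PyTorch sketch of how these hyperparameters fit together; the loop structure and function names are illustrative assumptions, not the repository's exact code, while the hyperparameter values come from the list above.

```python
import torch
from torch import nn

def make_training_objects(model):
    """Optimizer, scheduler, loss, and AMP scaler with the values listed above."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=5, T_mult=2)
    criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
    scaler = torch.cuda.amp.GradScaler()          # fp16 mixed precision (AMP)
    return optimizer, scheduler, criterion, scaler

def train_one_epoch(model, loader, optimizer, scheduler, criterion, scaler):
    """One training epoch under AMP; batch size 8 is set when building `loader`."""
    model.train()
    for images, targets in loader:
        images, targets = images.cuda(), targets.cuda()
        optimizer.zero_grad(set_to_none=True)
        with torch.cuda.amp.autocast():           # forward pass in fp16
            loss = criterion(model(images), targets)
        scaler.scale(loss).backward()             # scaled backward avoids fp16 underflow
        scaler.step(optimizer)
        scaler.update()
    scheduler.step()                              # cosine warm restarts, stepped per epoch
```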
🎯 6-Phase Progressive Training
Phase 1 ──▶ Phase 2 ──▶ Phase 3 ──▶ Phase 4 ──▶ Phase 5 ──▶ Phase 6
 280px       320px       320px       380px       380px       400px
 None        MixUp       MixUp       CutMix      CutMix      Hybrid
             α=0.2       α=0.4       α=0.2       α=0.5       MixUp+Cut
Phase Details
| Phase | Epochs | Resolution | Technique | Alpha | Dropout | Purpose |
|---|---|---|---|---|---|---|
| 1️⃣ Basic | 1-5 | 280×280 | Basic Preprocessing | - | 0.4 | Learn fundamental features |
| 2️⃣ MixUp Soft | 6-10 | 320×320 | MixUp | 0.2 | 0.2 | Gentle texture blending |
| 3️⃣ MixUp Hard | 11-15 | 320×320 | MixUp | 0.4 | 0.2 | Strong texture mixing |
| 4️⃣ CutMix Soft | 16-20 | 380×380 | CutMix | 0.2 | 0.2 | Learn partial structures |
| 5️⃣ CutMix Hard | 21-30 | 380×380 | CutMix | 0.5 | 0.2 | Handle occlusions |
| 6️⃣ Grand Finale | 31-40 | 400×400 | Hybrid | 0.1-0.3 | 0.2 | Final polish with both |
💡 Why Progressive Training? Starting with low resolution helps the model learn general shapes first. Gradual augmentation increase builds robustness incrementally.
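The schedule in the tables above can be captured as a simple configuration list; this is a hypothetical representation for illustration, not the original training code, and the Phase 6 alpha range is taken from the table.

```python
# Hypothetical phase schedule built from the tables above.
PHASES = [
    {"name": "Basic",        "epochs": range(1, 6),   "img_size": 280, "aug": None,     "alpha": None,       "dropout": 0.4},
    {"name": "MixUp Soft",   "epochs": range(6, 11),  "img_size": 320, "aug": "mixup",  "alpha": 0.2,        "dropout": 0.2},
    {"name": "MixUp Hard",   "epochs": range(11, 16), "img_size": 320, "aug": "mixup",  "alpha": 0.4,        "dropout": 0.2},
    {"name": "CutMix Soft",  "epochs": range(16, 21), "img_size": 380, "aug": "cutmix", "alpha": 0.2,        "dropout": 0.2},
    {"name": "CutMix Hard",  "epochs": range(21, 31), "img_size": 380, "aug": "cutmix", "alpha": 0.5,        "dropout": 0.2},
    {"name": "Grand Finale", "epochs": range(31, 41), "img_size": 400, "aug": "hybrid", "alpha": (0.1, 0.3), "dropout": 0.2},
]
```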
🖼️ Preprocessing Pipeline (All Phases)
⚠️ Note: These preprocessing steps are applied in ALL PHASES. Only `img_size` changes per phase.
Complete Training Flow
┌──────────────────────────────────────────────────────────────┐
│ 📷 RAW IMAGE INPUT                                           │
└──────────────────────────────────────────────────────────────┘
                               ↓
┌──────────────────────────────────────────────────────────────┐
│ 🔄 STEP 1: IMAGE-LEVEL PREPROCESSING (Per image)             │
├──────────────────────────────────────────────────────────────┤
│ 1️⃣ Resize          → (img_size + 32) × (img_size + 32)      │
│ 2️⃣ RandomCrop      → img_size × img_size                    │
│ 3️⃣ HorizontalFlip  → p=0.5                                  │
│ 4️⃣ RandomRotation  → ±20°                                   │
│ 5️⃣ RandomAffine    → scale=(0.8, 1.2)                       │
│ 6️⃣ ColorJitter     → brightness, contrast, saturation=0.2   │
│ 7️⃣ ToTensor        → [0-255] → [0.0-1.0]                    │
│ 8️⃣ Normalize       → ImageNet mean/std                      │
└──────────────────────────────────────────────────────────────┘
                               ↓
┌──────────────────────────────────────────────────────────────┐
│ 🎲 STEP 2: BATCH-LEVEL AUGMENTATION (Phase-specific)         │
├──────────────────────────────────────────────────────────────┤
│ Phase 1:   None   (Preprocessing only)                       │
│ Phase 2-3: MixUp  (λ×ImageA + (1-λ)×ImageB)                  │
│ Phase 4-5: CutMix (Patch swap between images)                │
│ Phase 6:   Hybrid (MixUp + CutMix combined)                  │
└──────────────────────────────────────────────────────────────┘
                               ↓
┌──────────────────────────────────────────────────────────────┐
│ 🎯 READY FOR MODEL TRAINING                                  │
└──────────────────────────────────────────────────────────────┘
Phase-Specific Image Sizes
| Phase | img_size | Resize To | RandomCrop To |
|---|---|---|---|
| 1️⃣ Basic | 280 | 312×312 | 280×280 |
| 2️⃣ MixUp Soft | 320 | 352×352 | 320×320 |
| 3️⃣ MixUp Hard | 320 | 352×352 | 320×320 |
| 4️⃣ CutMix Soft | 380 | 412×412 | 380×380 |
| 5️⃣ CutMix Hard | 380 | 412×412 | 380×380 |
| 6️⃣ Grand Finale | 400 | 432×432 | 400×400 |
Preprocessing Details (All Phases)
| Step | Transform | Parameters | Purpose |
|---|---|---|---|
| 1️⃣ | Resize | (size+32, size+32) | Prepare for random crop |
| 2️⃣ | RandomCrop | (size, size) | Random position augmentation |
| 3️⃣ | RandomHorizontalFlip | p=0.5 | Left-right invariance |
| 4️⃣ | RandomRotation | degrees=20 | Rotation invariance |
| 5️⃣ | RandomAffine | scale=(0.8, 1.2) | Scale variation |
| 6️⃣ | ColorJitter | (0.2, 0.2, 0.2) | Brightness/Contrast/Saturation |
| 7️⃣ | ToTensor | - | Convert to PyTorch tensor |
| 8️⃣ | Normalize | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225] | ImageNet normalization |
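A torchvision sketch of this 8-step pipeline; argument values not listed in the table (e.g. the affine `degrees=0`) are assumptions.

```python
from torchvision import transforms

IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def build_train_transform(img_size: int) -> transforms.Compose:
    """Per-image training pipeline used in every phase; only img_size varies."""
    return transforms.Compose([
        transforms.Resize((img_size + 32, img_size + 32)),     # 1. prepare for random crop
        transforms.RandomCrop(img_size),                        # 2. random position
        transforms.RandomHorizontalFlip(p=0.5),                 # 3. left-right invariance
        transforms.RandomRotation(degrees=20),                  # 4. ±20° rotation
        transforms.RandomAffine(degrees=0, scale=(0.8, 1.2)),   # 5. scale variation only
        transforms.ColorJitter(0.2, 0.2, 0.2),                  # 6. brightness/contrast/saturation
        transforms.ToTensor(),                                  # 7. [0, 255] -> [0.0, 1.0]
        transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),      # 8. ImageNet normalization
    ])
```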
Test/Validation Preprocessing
| Step | Transform | Parameters |
|---|---|---|
| 1️⃣ | Resize | (size, size) |
| 2️⃣ | ToTensor | - |
| 3️⃣ | Normalize | ImageNet mean/std |
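The corresponding deterministic evaluation pipeline, sketched with the same constants as the training transform above:

```python
def build_eval_transform(img_size: int) -> transforms.Compose:
    """Test/validation pipeline: no random augmentation, only resize + normalize."""
    return transforms.Compose([
        transforms.Resize((img_size, img_size)),
        transforms.ToTensor(),
        transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
    ])
```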
💡 Key Insight: Preprocessing (8 steps) is applied per image in every phase. MixUp/CutMix is applied AFTER preprocessing as batch-level augmentation.
🎲 Batch-Level Augmentation Techniques (Phase-Specific)
MixUp
Image A (Rose) + Image B (Sunflower)
          ↓
λ = Beta(α, α) → New Image = λ×A + (1-λ)×B
          ↓
Blended Image (70% Rose + 30% Sunflower features)
Benefits: ✅ Smoother decision boundaries ✅ Reduces overconfidence ✅ Better generalization
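A generic MixUp sketch (standard formulation, not necessarily the repository's exact implementation); the loss is then computed as λ·CE(pred, targets_a) + (1-λ)·CE(pred, targets_b).

```python
import numpy as np
import torch

def mixup_batch(images: torch.Tensor, targets: torch.Tensor, alpha: float):
    """Blend each image with a randomly paired image from the same batch."""
    lam = np.random.beta(alpha, alpha)                          # λ ~ Beta(α, α)
    perm = torch.randperm(images.size(0), device=images.device)  # random pairing
    mixed = lam * images + (1.0 - lam) * images[perm]            # λ×A + (1-λ)×B
    return mixed, targets, targets[perm], lam
```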
CutMix
Image A (Rose) + Random BBox from Image B (Sunflower)
          ↓
Paste B's region onto A
          ↓
Composite Image (Rose background + Sunflower patch)
Benefits: ✅ Object completion ability ✅ Occlusion robustness ✅ Localization skills
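A generic CutMix sketch in the same style as the MixUp snippet above; again a standard formulation, not the original code.

```python
import numpy as np
import torch

def cutmix_batch(images: torch.Tensor, targets: torch.Tensor, alpha: float):
    """Paste a random box from a shuffled copy of the batch onto each image."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0), device=images.device)
    _, _, h, w = images.shape
    cut_h, cut_w = int(h * (1 - lam) ** 0.5), int(w * (1 - lam) ** 0.5)
    cy, cx = np.random.randint(h), np.random.randint(w)          # random box center
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    out = images.clone()
    out[:, :, y1:y2, x1:x2] = images[perm][:, :, y1:y2, x1:x2]   # patch swap
    lam = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)                # correct λ to true area
    return out, targets, targets[perm], lam
```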
Hybrid (Grand Finale)
- Apply MixUp (blend two images)
- Apply CutMix (cut on blended image)
- Result: Maximum augmentation challenge
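One way to realize the hybrid step by chaining the two sketches above; how the resulting λ values and target pairs are combined into the loss is an assumption, since the card only states that both techniques are applied.

```python
def hybrid_batch(images, targets, alpha):
    """'Grand Finale' sketch: MixUp first, then CutMix on the blended batch."""
    images, t_a, t_b, lam_mix = mixup_batch(images, targets, alpha)
    images, _, t_c, lam_cut = cutmix_batch(images, t_a, alpha)
    return images, (t_a, t_b, t_c), (lam_mix, lam_cut)
```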
🛡️ Smart Training Features
Two-Layer Early Stopping
| Layer | Condition | Patience | Action |
|---|---|---|---|
| Phase-level | Train↓ + Val↑ (Overfitting) | 2 epochs | Skip to next phase |
| Global | Val loss not improving | 8 epochs | Stop training |
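An illustrative sketch of this two-layer rule; the counter and method names are assumptions, only the patience values (2 phase-level, 8 global) come from the table above.

```python
class TwoLayerEarlyStopping:
    """Phase-level stop on repeated overfitting, global stop on stalled val loss."""

    def __init__(self, phase_patience: int = 2, global_patience: int = 8):
        self.phase_patience = phase_patience
        self.global_patience = global_patience
        self.phase_bad = 0                    # consecutive overfitting epochs in this phase
        self.global_bad = 0                   # consecutive epochs without val-loss improvement
        self.best_val = float("inf")

    def update(self, train_loss, val_loss, prev_train_loss, prev_val_loss):
        overfit = train_loss < prev_train_loss and val_loss > prev_val_loss
        self.phase_bad = self.phase_bad + 1 if overfit else 0
        if val_loss < self.best_val:
            self.best_val, self.global_bad = val_loss, 0
        else:
            self.global_bad += 1
        skip_phase = self.phase_bad >= self.phase_patience      # jump to next phase
        stop_training = self.global_bad >= self.global_patience  # end the run
        return skip_phase, stop_training

    def new_phase(self):
        self.phase_bad = 0                    # phase-level counter resets on phase change
```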
Smart Dropout Mechanism
| Signal | Condition | Action |
|---|---|---|
| ⚠️ Overfitting | Train↓ + Val↑ | Dropout += 0.05 |
| 📉 Underfitting | Train↑ + Val↑ | Dropout -= 0.05 |
| ✅ Normal | Train↓ + Val↓ | No change |
Bounds: min=0.10, max=0.50
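A sketch of the smart-dropout rule; the exact comparison logic is an assumption, while the ±0.05 step and the 0.10/0.50 bounds come from the table above. In a real loop the returned value would be written back into the model's `nn.Dropout` modules each epoch.

```python
def adjust_dropout(p, train_loss, val_loss, prev_train_loss, prev_val_loss,
                   step=0.05, p_min=0.10, p_max=0.50):
    """Raise dropout on an overfitting signal, lower it on an underfitting signal."""
    train_down = train_loss < prev_train_loss
    val_down = val_loss < prev_val_loss
    if train_down and not val_down:        # overfitting: train improves, val worsens
        p += step
    elif not train_down and not val_down:  # underfitting: neither improves
        p -= step
    return min(max(p, p_min), p_max)       # clamp to [0.10, 0.50]
```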
Model Architecture
EfficientNet-B4 (pretrained)
└── Custom Classifier Head
    ├── BatchNorm1d (1792)
    ├── Dropout
    ├── Linear (1792 → 512)
    ├── GELU
    ├── BatchNorm1d (512)
    ├── Dropout
    └── Linear (512 → 102)
Total Parameters: ~19M (all trainable)
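A sketch of this architecture using torchvision's `efficientnet_b4`; the original repository may build the backbone differently (e.g. via timm), and the dropout probability follows the phase schedule.

```python
from torch import nn
from torchvision import models

def build_model(num_classes: int = 102, dropout: float = 0.4) -> nn.Module:
    """EfficientNet-B4 backbone with the custom classifier head shown above."""
    backbone = models.efficientnet_b4(
        weights=models.EfficientNet_B4_Weights.IMAGENET1K_V1  # ImageNet pretrained
    )
    backbone.classifier = nn.Sequential(      # replace the stock 1000-class head
        nn.BatchNorm1d(1792),
        nn.Dropout(dropout),
        nn.Linear(1792, 512),
        nn.GELU(),
        nn.BatchNorm1d(512),
        nn.Dropout(dropout),
        nn.Linear(512, num_classes),
    )
    return backbone
```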
Supported Flower Classes
102 flower species including: Rose, Sunflower, Tulip, Orchid, Lily, Daisy, Hibiscus, Lotus, Magnolia, and 93 more.
Limitations
- Trained only on Oxford Flowers-102 dataset
- Best performance at 400Γ400 resolution
- May not generalize well to flowers outside the 102 trained classes
Citation
@misc{efficientnet-b4-flowers102,
title={EfficientNet-B4 Flower Classifier with 6-Phase Progressive Training},
author={fth2745},
year={2024},
url={https://huggingface.co/fth2745/efficientnet-b4-flowers102}
}