🎯 Fake Image Detection Ensemble (9 Models)

An ensemble of 9 specialized models for detecting fake/AI-generated images using single-class (one-class) anomaly detection. The models are trained only on real images to learn what "normal" looks like, and then flag fakes as anomalies.

📊 Performance

| Metric    | Score  |
|-----------|--------|
| Accuracy  | 67.05% |
| Precision | 87.97% |
| Recall    | 39.50% |
| F1 Score  | 54.52% |

Confusion Matrix

  • True Negatives: 946 (real correctly identified)
  • False Positives: 54 (real misclassified as fake)
  • False Negatives: 605 (fake misclassified as real)
  • True Positives: 395 (fake correctly identified)
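
The scores in the table follow directly from these four counts. As a quick check, the short sketch below recomputes them, treating "fake" as the positive class; the variable names are illustrative only.

```python
# Confusion-matrix counts reported above ("fake" is the positive class).
tn, fp, fn, tp = 946, 54, 605, 395

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)          # of images flagged as fake, how many were fake
recall = tp / (tp + fn)             # of fake images, how many were flagged
f1 = 2 * precision * recall / (precision + recall)
false_positive_rate = fp / (fp + tn)

print(f"Accuracy:  {accuracy:.2%}")
print(f"Precision: {precision:.2%}")
print(f"Recall:    {recall:.2%}")
print(f"F1 Score:  {f1:.2%}")
print(f"False positive rate: {false_positive_rate:.2%}")
```

The high precision and low recall mean the ensemble rarely flags a real image, but it misses roughly 6 out of 10 fakes in this test set.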

🏗️ Architecture

The ensemble combines 9 specialized models using different detection strategies:

Deep Learning Models (3):

  1. Enhanced Frequency VAE - Multi-scale frequency analysis with phase information

    • Uses both magnitude and phase of FFT
    • Spectral consistency loss
    • Detects frequency-domain artifacts
  2. Edge Normalizing Flow - Probability density estimation on edge features

    • Multi-scale edge analysis
    • Normalizing flow architecture
    • Detects unnatural edge patterns
  3. Semantic Deep SVDD - ResNet50-based hypersphere anomaly detection

    • Semantic feature extraction
    • One-class deep learning
    • Detects high-level semantic anomalies
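
To make the "magnitude and phase of FFT" idea concrete, here is a minimal NumPy sketch of how a grayscale image could be turned into a two-channel frequency representation. It is an illustration only, not the Enhanced Frequency VAE's actual preprocessing: the function name `fft_magnitude_phase`, the log scaling, and the random test image are all assumptions.

```python
import numpy as np

def fft_magnitude_phase(gray: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return log-magnitude and phase of the centered 2-D FFT of a grayscale image."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))   # move the zero frequency to the center
    log_magnitude = np.log1p(np.abs(spectrum))      # compress the large dynamic range
    phase = np.angle(spectrum)                      # radians in (-pi, pi]
    return log_magnitude, phase

# Placeholder input: a random 256x256 "image" (hypothetical data, not a real photo).
rng = np.random.default_rng(0)
image = rng.random((256, 256))

log_mag, phase = fft_magnitude_phase(image)
features = np.stack([log_mag, phase])               # shape (2, 256, 256)
print(features.shape)
```

A model that sees both channels can, in principle, pick up artifacts that leave the magnitude spectrum plausible but disturb the phase structure.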

Traditional ML Models (6):

  1. Texture One-Class SVM - Boundary-based detection

    • Enhanced texture features
    • RBF kernel
    • Tight decision boundary (nu=0.03)
  2. Isolation Forest - Isolation-based anomaly detection

    • 200 estimators
    • Frequency + spatial features
    • Fast inference
  3. Local Outlier Factor - Local density anomalies

    • Multi-scale patch analysis
    • Novelty detection mode
    • 20 neighbors
  4. Gaussian Mixture Model - Distribution modeling

    • 10 components
    • Full covariance
    • Color distribution analysis
  5. Color Distribution Model - Statistical color analysis

    • RGB histograms
    • Mahalanobis distance
    • Color moment analysis
  6. Statistical Model - Edge and color statistics

    • Sobel edge detection
    • Multi-scale analysis
    • Mahalanobis distance
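
The hyperparameters listed for the first four traditional models map onto standard scikit-learn estimators. The sketch below shows one way they might be configured and fit on features of real images only. The feature arrays are random placeholders, and every setting not stated in the list above (random seeds, remaining defaults) is an assumption, so this is not the author's training code.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
real_features = rng.normal(size=(500, 8))            # placeholder: features of real images
query_features = rng.normal(loc=4.0, size=(5, 8))    # placeholder: features of images to check

detectors = {
    "ocsvm": OneClassSVM(kernel="rbf", nu=0.03),
    "iforest": IsolationForest(n_estimators=200, random_state=0),
    "lof": LocalOutlierFactor(n_neighbors=20, novelty=True),
    "gmm": GaussianMixture(n_components=10, covariance_type="full", random_state=0),
}

anomaly_scores = {}
for name, model in detectors.items():
    model.fit(real_features)                          # one-class: fit on real data only
    if name == "gmm":
        # Low log-likelihood under the fitted mixture means "unusual".
        anomaly_scores[name] = -model.score_samples(query_features)
    else:
        # decision_function is positive for inliers, so negate it.
        anomaly_scores[name] = -model.decision_function(query_features)

for name, scores in anomaly_scores.items():
    print(name, np.round(scores, 3))
```

After negation, a higher value means "more anomalous" for every detector, which makes the scores easier to combine later. The raw scales still differ between models, so some normalization would be needed before weighting them.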

🎓 Training Details

  • Training Data: 30,000 real images from COCO dataset
  • Training Approach: Single-class anomaly detection (NO fake images used)
  • Validation Split: 20% (6,000 images)
  • Test Set: 1,000 real + 1,000 fake images (completely separate)
  • Training Time: ~5-6 hours on GPU
  • Ensemble Method: Weighted voting with adaptive threshold

Model Training Times (Extended):

  • Enhanced Frequency VAE: 45 minutes
  • Texture One-Class SVM: 45 minutes
  • Color Distribution Model: 30 minutes
  • Edge Normalizing Flow: 45 minutes
  • Semantic Deep SVDD: 45 minutes
  • Statistical Model: 30 minutes
  • Isolation Forest: 30 minutes
  • Local Outlier Factor: 35 minutes
  • Gaussian Mixture Model: 30 minutes
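
As a consistency check, the per-model times can be summed and compared with the overall "~5-6 hours" figure. This assumes the models were trained one after another, which the card does not state explicitly.

```python
# Per-model training times in minutes, as listed above.
training_minutes = {
    "Enhanced Frequency VAE": 45,
    "Texture One-Class SVM": 45,
    "Color Distribution Model": 30,
    "Edge Normalizing Flow": 45,
    "Semantic Deep SVDD": 45,
    "Statistical Model": 30,
    "Isolation Forest": 30,
    "Local Outlier Factor": 35,
    "Gaussian Mixture Model": 30,
}

total_minutes = sum(training_minutes.values())
hours, minutes = divmod(total_minutes, 60)
print(f"Total: {total_minutes} min = {hours} h {minutes} min")
```

The total of 335 minutes (5 h 35 min) falls inside the stated 5-6 hour range.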

🚀 Quick Start

import torch
from torchvision import transforms
from PIL import Image
import pickle
import json
from huggingface_hub import hf_hub_download

# Configuration
repo_id = "ash12321/fake-image-detection-ensemble"
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Download and load config
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
with open(config_path, 'r') as f:
    config = json.load(f)

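# Load models (you need the model class definitions, which are not included in this snippet)
# Example for one PyTorch model:
vae_path = hf_hub_download(repo_id=repo_id, filename="freq_vae.pth")
# freq_vae = EnhancedFreqVAE()
# freq_vae.load_state_dict(torch.load(vae_path, map_location=device))
# freq_vae.to(device).eval()

# Example for one pickled model. Only unpickle files from a source you trust;
# unpickling may also need the original class definitions to be importable.
iforest_path = hf_hub_download(repo_id=repo_id, filename="iforest.pkl")
with open(iforest_path, 'rb') as f:
    iforest = pickle.load(f)

# Load all other models similarly...

# Predict on new image
# Convert to RGB before resizing so palette and grayscale images are resampled correctly
img = Image.open('test_image.jpg').convert('RGB')
img = img.resize((256, 256), Image.LANCZOS)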

tfm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.485,0.456,0.406], [0.229,0.224,0.225])
])
img_tensor = tfm(img)

# Get prediction from ensemble
# (`ensemble` must be built from the loaded models; its class is not defined in this snippet)
is_fake, score, individual_scores = ensemble.predict(img_tensor, device)
print(f"Prediction: {'FAKE' if is_fake else 'REAL'}")
print(f"Anomaly Score: {score:.4f}")
print(f"Individual model scores: {individual_scores}")
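
The snippet above calls `ensemble.predict(...)` and unpacks three return values. The following sketch shows what a minimal wrapper with that return shape could look like, assuming each model is exposed as a callable that returns one anomaly score on a comparable scale. The class name `WeightedEnsemble`, the scorer interface, the weights, and the threshold are all hypothetical; the real values would come from the repository's code and `config.json`, whose schema is not documented here.

```python
from typing import Callable, Dict, Tuple

class WeightedEnsemble:
    """Hypothetical stand-in for the `ensemble` object: weighted average plus threshold."""

    def __init__(self, scorers: Dict[str, Callable], weights: Dict[str, float], threshold: float):
        self.scorers = scorers        # name -> function(img_tensor, device) -> anomaly score
        self.weights = weights        # name -> voting weight
        self.threshold = threshold    # scores above this are labeled fake

    def predict(self, img_tensor, device="cpu") -> Tuple[bool, float, Dict[str, float]]:
        individual = {name: float(fn(img_tensor, device)) for name, fn in self.scorers.items()}
        total_weight = sum(self.weights[name] for name in individual)
        score = sum(self.weights[name] * s for name, s in individual.items()) / total_weight
        return score > self.threshold, score, individual

# Usage with dummy scorers that ignore their input (illustration only).
ensemble = WeightedEnsemble(
    scorers={"freq_vae": lambda x, d: 0.9, "iforest": lambda x, d: 0.3},
    weights={"freq_vae": 2.0, "iforest": 1.0},
    threshold=0.5,
)
is_fake, score, individual_scores = ensemble.predict(img_tensor=None)
print(is_fake, round(score, 3), individual_scores)
```

Returning the per-model scores alongside the combined score makes it easier to see which detection strategy drove a given decision.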

📦 Model Files

| File                 | Description                    | Size    |
|----------------------|--------------------------------|---------|
| freq_vae.pth         | Enhanced Frequency VAE weights | ~100 MB |
| semantic_svdd.pth    | Semantic Deep SVDD weights     | ~90 MB  |
| edge_flow.pth        | Edge Normalizing Flow weights  | ~5 MB   |
| texture_ocsvm.pkl    | Texture One-Class SVM          | ~200 MB |
| iforest.pkl          | Isolation Forest               | ~150 MB |
| lof.pkl              | Local Outlier Factor           | ~180 MB |
| gmm.pkl              | Gaussian Mixture Model         | ~50 MB  |
| color_model.pkl      | Color Distribution Model       | ~10 MB  |
| stat.pkl             | Statistical Model              | ~5 MB   |
| config.json          | Ensemble configuration         | <1 MB   |
| results_summary.json | Training metrics               | <1 MB   |
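
The files come in two formats: `.pth` (PyTorch weights) and `.pkl` (pickled objects). A small helper that dispatches on the extension could look like the sketch below. The function name `load_artifact` is hypothetical, and the helper assumes the `.pth` files hold state dicts that still have to be passed to the matching model class via `load_state_dict`.

```python
import pickle
from pathlib import Path

def load_artifact(path, device="cpu"):
    """Load a downloaded model file based on its extension."""
    path = Path(path)
    if path.suffix == ".pth":
        import torch  # imported lazily so the pickle branch works without PyTorch
        return torch.load(path, map_location=device)
    if path.suffix == ".pkl":
        with path.open("rb") as f:
            return pickle.load(f)  # only unpickle files from sources you trust
    raise ValueError(f"Unsupported model file: {path.name}")
```

For example, `load_artifact(hf_hub_download(repo_id=repo_id, filename="gmm.pkl"))` would return the unpickled object, provided any classes it depends on are importable.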

🔧 Requirements

torch>=2.0.0
torchvision>=0.15.0
numpy>=1.24.0
pillow>=9.0.0
scikit-learn>=1.3.0
scipy>=1.10.0
huggingface_hub>=0.19.0

🎯 Use Cases

  • Deepfake Detection: Identify AI-generated faces
  • Image Forensics: Detect manipulated images
  • Content Moderation: Filter synthetic content
  • Research: Study AI-generated image characteristics
  • Quality Control: Verify image authenticity

⚠️ Limitations

  • Trained on COCO real images - performance may vary on other domains
  • Requires 256×256 input resolution
  • May struggle with heavily compressed or low-quality images
  • Performance depends on similarity between training and test distributions
  • Not designed for adversarial attacks

📈 Model Improvements

This version includes several accuracy enhancements:

  1. Phase Information: VAE uses both magnitude and phase of FFT
  2. Enhanced Features: More comprehensive texture and edge features
  3. Adaptive Threshold: Auto-calibrated at 95th percentile
  4. Optimized Weights: Balanced ensemble voting
  5. Extended Training: Up to 45 minutes per model for better convergence
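
The adaptive threshold in item 3 can be illustrated with a short sketch. It assumes the percentile is taken over the ensemble's anomaly scores on the held-out real validation images; the card does not say which scores are used, and the scores below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder: ensemble anomaly scores for 6,000 real validation images.
val_scores_real = rng.normal(loc=0.0, scale=1.0, size=6000)

# Calibrate the cut-off so that about 5% of real images score above it.
threshold = np.percentile(val_scores_real, 95)
flagged = val_scores_real > threshold

print(f"threshold = {threshold:.3f}, flagged fraction = {flagged.mean():.3f}")
```

A threshold chosen this way fixes the expected false positive rate on real images at about 5%, which is in line with the 54 of 1,000 real test images (5.4%) misclassified as fake in the confusion matrix. It says nothing about how many fakes fall above the threshold, so recall is not controlled by this calibration.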

📝 Citation

@misc{fake-detection-ensemble-2024,
  author = {ash12321},
  title = {Fake Image Detection Ensemble - 9 Model System},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ash12321/fake-image-detection-ensemble}}
}

📄 License

MIT License - Free for research and commercial use

🙏 Acknowledgments

  • COCO Dataset for training data
  • PyTorch and scikit-learn communities
  • Hugging Face for model hosting

📞 Contact

Questions? Issues? Open an issue or discussion on this repository!


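Note: This model was trained using single-class learning on real images only, so it does not depend on any particular generator and can, in principle, flag fake images from generators not seen during training. In practice, recall on the test set is 39.50%, so most fakes in that set were missed. The ensemble combines multiple detection strategies and produces few false alarms (54 of 1,000 real test images flagged), at the cost of missing many fakes.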
