---
tags:
- autogluon
- multimodal
- image-classification
- binary-classification
- ensemble-learning
- education
- homework
datasets:
- ecopus/sign_identification
license: mit
---

# HW1 Sign Identification with AutoGluon

## Model Description

This repository contains an **AutoML image classification model trained with AutoGluon Multimodal** to identify two categories of sign images. The model was trained as part of **Homework 1** in CMU 24-679 (Designing and Deploying AI/ML).

- **Framework**: [AutoGluon Multimodal](https://auto.gluon.ai/stable/tutorials/multimodal/index.html)
- **Backbone**: TimmAutoModelForImagePrediction (~194M parameters)
- **Task**: Binary image classification (`label`)
- **Classes**: `0`, `1`

---

## Training Setup

- **Dataset**: [ecopus/sign_identification](https://huggingface.co/datasets/ecopus/sign_identification)
  - `augmented` split (385 samples) → used for training/validation (80/20).
  - `original` split (35 samples) → reserved for final testing.
- **Time budget**: 300 seconds (5 minutes).
- **Hardware**: Colab GPU (CUDA 12.6, mixed precision).
- **Presets**: `best_quality` (ensembling + hyperparameter tuning).

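The setup above can be sketched roughly as follows. This is a minimal illustration, not the exact homework code: the shuffled (non-stratified) split, the seed `42`, the helper names `split_train_val` and `fit_predictor`, and the `roc_auc` metric choice are assumptions made for the example. The `fit_predictor` helper imports AutoGluon lazily and expects pandas DataFrames with `image` (file path) and `label` columns.

```python
import random


def split_train_val(rows, val_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and split off a validation share.

    Assumption: a plain shuffled split; the card does not say whether
    the original 80/20 split was stratified.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def fit_predictor(train_df, val_df, time_limit=300):
    """Fit a MultiModalPredictor with the settings listed in the card.

    train_df and val_df are pandas DataFrames with 'image' and 'label'
    columns. The eval_metric is an assumption based on the reported
    validation ROC-AUC.
    """
    # Imported lazily so the split helper works without AutoGluon installed.
    from autogluon.multimodal import MultiModalPredictor

    predictor = MultiModalPredictor(
        label="label",
        problem_type="binary",
        eval_metric="roc_auc",
        presets="best_quality",
    )
    predictor.fit(train_data=train_df, tuning_data=val_df, time_limit=time_limit)
    return predictor


if __name__ == "__main__":
    # 385 augmented samples split 80/20 gives 308 training and 77 validation rows.
    train_rows, val_rows = split_train_val(range(385))
    print(len(train_rows), len(val_rows))
```

Passing an explicit validation set through `tuning_data` keeps the 80/20 split under your control instead of leaving the holdout choice to AutoGluon.
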

---

## Results

- **Validation ROC-AUC**: 0.998
- **Test Accuracy**: 97.1%
- **Weighted F1**: 97.1%

### Classification Report (Test Set)

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| 0 | 0.95 | 1.00 | 0.97 | 19 |
| 1 | 1.00 | 0.94 | 0.97 | 16 |
| accuracy | | | 0.97 | 35 |
| macro avg | 0.97 | 0.97 | 0.97 | 35 |
| weighted avg | 0.97 | 0.97 | 0.97 | 35 |

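As a sanity check, the headline numbers can be re-derived from the classification report. The sketch below uses confusion counts inferred from the reported precision, recall, and support (19 of 19 class-`0` images correct, 15 of 16 class-`1` images correct). The matrix itself is an assumption reconstructed from the report, not something published with the model, and the helper names are illustrative.

```python
# Confusion counts inferred from the classification report (an assumption;
# the card does not publish the matrix). Outer keys are true classes,
# inner keys are predicted classes.
CONFUSION = {
    0: {0: 19, 1: 0},
    1: {0: 1, 1: 15},
}


def per_class_metrics(confusion, cls):
    """Return (precision, recall, f1, support) for one class."""
    tp = confusion[cls][cls]
    fn = sum(n for pred, n in confusion[cls].items() if pred != cls)
    fp = sum(row[cls] for true, row in confusion.items() if true != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1, tp + fn


def summarize(confusion):
    """Compute accuracy, support-weighted F1, and per-class metrics."""
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(confusion[c][c] for c in confusion)
    per_class = {c: per_class_metrics(confusion, c) for c in confusion}
    weighted_f1 = sum(f1 * support for _, _, f1, support in per_class.values()) / total
    return {
        "accuracy": correct / total,
        "weighted_f1": weighted_f1,
        "per_class": per_class,
    }


if __name__ == "__main__":
    report = summarize(CONFUSION)
    print(f"accuracy:    {report['accuracy']:.3f}")
    print(f"weighted F1: {report['weighted_f1']:.3f}")
```

With these counts, accuracy is 34/35 and the weighted F1 comes out at about 0.971, matching the reported 97.1%. A single misclassified image moves either figure by roughly three percentage points on a 35-image test set.
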

---

## How to Use

### Install requirements

```bash
pip install autogluon.multimodal huggingface_hub cloudpickle
```

### Load the pickled predictor

```python
import cloudpickle
import pandas as pd
from huggingface_hub import hf_hub_download

# Download the pickled predictor from the model repository
pkl_path = hf_hub_download(
    repo_id="cassieli226/sign-identification-automl",
    filename="autogluon_predictor.pkl",
    repo_type="model",
)

with open(pkl_path, "rb") as f:
    predictor = cloudpickle.load(f)

# Predict on new data (expects DataFrame with 'image' column containing file paths)
X_test = pd.DataFrame({"image": ["path/to/your/image.png"]})
print(predictor.predict(X_test))
```

### Load the native predictor directory

```python
import pathlib
import shutil
import zipfile

import autogluon.multimodal as ag
import pandas as pd
from huggingface_hub import hf_hub_download

# Download the zipped AutoGluon predictor directory
zip_path = hf_hub_download(
    repo_id="cassieli226/sign-identification-automl",
    filename="autogluon_predictor_dir.zip",
    repo_type="model",
)

# Extract into a clean local directory, replacing any previous copy
extract_dir = pathlib.Path("predictor_dir")
if extract_dir.exists():
    shutil.rmtree(extract_dir)
with zipfile.ZipFile(zip_path, "r") as zf:
    zf.extractall(str(extract_dir))

predictor = ag.MultiModalPredictor.load(str(extract_dir))
print(predictor.predict(pd.DataFrame({"image": ["path/to/your/image.png"]})))
```


---

## Intended Use

- Coursework demonstration of AutoML for neural networks on images.
- Educational example of using augmented vs. original splits for training and evaluation.

## Limitations

- Trained on a small student-collected dataset (≈420 images: 385 augmented, 35 original).
- Accuracy may not generalize to unseen real-world data.
- Model assumes binary labels only (`0`, `1`).

## Ethical Notes

- Dataset is non-sensitive and contains no personal information.
- Augmentation was applied responsibly to avoid unrealistic samples.

## References

- Dataset: [ecopus/sign_identification](https://huggingface.co/datasets/ecopus/sign_identification)
- Framework: [AutoGluon](https://auto.gluon.ai/stable/tutorials/multimodal/index.html)
- OpenAI’s ChatGPT (2025) was used for code generation, structuring, and debugging.