avanishd
/

vit-base-patch16-dinov3-finetuned-skin-lesion-classification

Safetensors

Eval Results

avanishd commited on 15 days ago

Commit

fb552e4

verified ·

1 Parent(s): 32d3642

Create README.md


README.md +242 -0

README.md ADDED Viewed

	@@ -0,0 +1,242 @@

+---
+metrics:
+- pAUC
+model-index:
+- name: >-
+    avanishd/vit-base-patch16-dinov3-finetuned-skin-lesion-classification
+  results:
+  - task:
+      name: Image Classification
+      type: image-classification
+    metrics:
+    - name: pAUC
+      type: pAUC
+      value: 0.1441070826953209
+base_model:
+- timm/vit_base_patch16_dinov3.lvd1689m
+---
+# vit-base-patch16-dinov3-finetuned-skin-lesion-classification
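+This model is a fine-tuned version of [timm/vit_base_patch16_dinov3.lvd1689m](https://huggingface.co/timm/vit_base_patch16_dinov3.lvd1689m) for skin lesion classification.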
+## Intended Uses & Limitations
+### Intended Use
+This model is intended for classifying dermoscopic skin lesion images at a 224x224 input size.
+### Limitations
+This model was trained for only 1 epoch and has seen few malignant examples, because the ISIC 2024 dataset has a large class imbalance.
+## How to Get Started with the Model
+```Python
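+import os
+import numpy as np
+import polars as pl
+import timm
+import torch
+import torch.nn as nn
+from huggingface_hub import PyTorchModelHubMixin
+from PIL import Image
+from torch.utils.data import DataLoader, Dataset
+from torchvision import transforms
+class DinoSkinLesionClassifier(nn.Module, PyTorchModelHubMixin):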
+  """
+  PyTorchModelHubMixin adds support for pushing the model to the Hugging Face Hub
+  See: https://huggingface.co/docs/hub/models-uploading#upload-a-pytorch-model-using-huggingfacehub
+  """
+  def __init__(self, num_classes=1, freeze_backbone=True):
+    super(DinoSkinLesionClassifier, self).__init__()
+    # Initialize Dino v3 backbone
+    self.backbone = timm.create_model('vit_base_patch16_dinov3', pretrained=True, num_classes=0, global_pool='avg')
+    # Freeze backbone weights if requested
+    # This makes training much faster
+    if freeze_backbone:
+      for param in self.backbone.parameters():
+        param.requires_grad = False
+    # Get feature dimension from the backbone
+    feat_dim = self.backbone.num_features
+    # Define the classification head
+    self.head = nn.Linear(feat_dim, num_classes) # Should be 768 in, 1 out
+  def forward(self, x):
+    out = self.backbone(x)
+    out = self.head(out)
+    return out
+from huggingface_hub import hf_hub_download
+weights_path = hf_hub_download(
+        repo_id="avanishd/vit-base-patch16-dinov3-finetuned-skin-lesion-classification",
+        filename="model.safetensors"
+    )
+from safetensors.torch import load_file
+device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+model = DinoSkinLesionClassifier()
+state = load_file(weights_path)
+model.load_state_dict(state, strict=True)
+model.to(device)  # Move the model to the GPU if one is available
+model.eval()  # Set model to evaluation mode
+# Example with PH2 Dataset
+class PH2Dataset(Dataset):
+  """
+  Dataset for PH2 images, which are in png format.
+  PH2 contains skin lesions images classified as
+  - Common Nevus (benign)
+  - Atypical Nevus (benign)
+  - Melanoma (malignant)
+  No need for an "is real" label here, since this dataset is used purely for testing
+  """
+  def __init__(self, dir_path, metadata, transform=None):
+    super(PH2Dataset, self).__init__()
+    self.dir_path = dir_path
+    self.transform = transform
+    self.image_files = [os.path.join(dir_path, f) for f in os.listdir(dir_path)
+                        if f.lower().endswith(('.jpg', '.jpeg', '.png'))]
+    # Load metadata w/ polars (only 2 columns)
+    self.metadata = pl.read_csv(metadata)
+    self.diagnostic_mapping = {
+        "Common Nevus": 0,
+        "Atypical Nevus": 0,
+        "Melanoma": 1,
+    }
+  def __len__(self):
+    return len(self.image_files)
+  def __getitem__(self, idx):
+    # Image names in the metadata CSV look like IMD003
+    image_id = os.path.splitext(os.path.basename(self.image_files[idx]))[0]
+    # Still need the entire path to open the image
+    image = Image.open(self.image_files[idx]).convert('RGB')
+    if self.transform: # Apply transform if it exists
+      image = self.transform(image)
+    diagnosis = self.metadata.filter(pl.col("image_name") == image_id).select("diagnosis").item()
+    label = torch.tensor(self.diagnostic_mapping[diagnosis], dtype=torch.int16)
+    return image, label
+transform = transforms.Compose([
+    transforms.ToTensor(),
+    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]), # ImageNet mean and std
+    transforms.Resize((224, 224)),  # Input size the model was fine-tuned with
+])
+ph_2_images = "/content/data/ph2_data/images"
+ph_2_metadata = "/content/data/ph2_data/ph_2_dataset.csv"
+ex_dataset = PH2Dataset(ph_2_images, ph_2_metadata, transform)
+ex_loader = DataLoader(ex_dataset, batch_size=64, shuffle=False)
+all_preds = []
+with torch.no_grad():  # No gradients are needed for inference
+  for images, labels in ex_loader:
+    images = images.to(device)
+    output = model(images)
+    y_pred_prob = torch.sigmoid(output).cpu().numpy().ravel()
+    all_preds.append(np.where(y_pred_prob < 0.5, 0, 1))
+y_pred = np.concatenate(all_preds)  # Predicted labels for the whole dataset
+```
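The last step of the example above turns each raw model output (a logit) into a probability with a sigmoid and then applies a 0.5 cutoff. The dependency-free sketch below isolates that step. The helper name `logits_to_labels` is hypothetical, and treating label 1 as malignant is an assumption carried over from the `diagnostic_mapping` in the PH2 example.

```python
import math


def logits_to_labels(logits, threshold=0.5):
    """Map raw logits to (probability, label) pairs.

    Hypothetical helper, not part of the model code. Label 1 is assumed
    to mean malignant, as in the PH2 diagnostic mapping.
    """
    results = []
    for z in logits:
        prob = 1.0 / (1.0 + math.exp(-z))  # Sigmoid
        # Same rule as np.where(prob < 0.5, 0, 1): a probability equal to
        # the threshold is assigned label 1.
        label = 0 if prob < threshold else 1
        results.append((prob, label))
    return results


if __name__ == "__main__":
    for prob, label in logits_to_labels([-2.0, 0.0, 2.0]):
        print(f"p={prob:.3f} -> label {label}")
```

The threshold is left as a parameter so that a different cutoff can be tried without changing the rest of the pipeline.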
+## Training and evaluation data
+This model was trained with the [ISIC 2024 challenge](https://www.kaggle.com/competitions/isic-2024-challenge) and [ISIC 2024 synthetic](https://www.kaggle.com/datasets/ilya9711nov/isic-2024-synthetic) datasets.
+For the ISIC 2024 Challenge data, an 80-20 train-test split was applied, and the test split was used to evaluate the model.
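To make the reported pAUC easier to interpret, here is a minimal sketch of how such a score can be computed. It assumes the metric is the one used by the ISIC 2024 challenge: the area under the ROC curve that lies above a true-positive rate of 80%, which ranges from 0 to 0.2. This card does not state which definition was used, so that is an assumption, and the function name `partial_auc_above_tpr` is hypothetical.

```python
def partial_auc_above_tpr(y_true, y_score, min_tpr=0.8):
    """Area between the ROC curve and the line TPR = min_tpr.

    Assumed definition (ISIC 2024 style); the maximum value is 1 - min_tpr.
    y_true holds 0/1 labels, y_score holds scores where higher means
    more likely positive.
    """
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("Both classes must be present")

    # Build ROC points by sweeping the threshold from high to low,
    # handling tied scores as a single step.
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j

    # Integrate max(TPR - min_tpr, 0) over FPR, segment by segment.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        h0, h1 = y0 - min_tpr, y1 - min_tpr
        dx = x1 - x0
        if h0 >= 0:
            area += 0.5 * (h0 + h1) * dx  # Whole segment is above the line
        elif h1 > 0:
            # Segment crosses the line: only the upper triangle counts.
            area += 0.5 * h1 * dx * (h1 / (h1 - h0))
    return area


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.2, 0.8, 0.9]
    print(partial_auc_above_tpr(labels, scores))
```

Under this definition a perfect ranking scores 0.2 and an uninformative one (all scores equal) scores 0.02, which gives a scale for reading the value reported above.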
+## Training Procedure
+### Training hyperparameters
+- learning_rate: 1e-4
+- train_batch_size: 64
+- eval_batch_size: 64
+- seed: 42
+- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with weight_decay=1e-2 and no additional optimizer arguments
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch | Step |
+|---------------|-------|------|
+| 0.5027 | 1 | 100 |
+| 0.5672 | 1 | 200 |
+| 0.5373 | 1 | 300 |
+| 0.4693 | 1 | 400 |
+| 5.3829 | 1 | 500 |
+| 0.4872 | 1 | 600 |
+| 0.4717 | 1 | 700 |
+| 0.4550 | 1 | 800 |
+| 0.4185 | 1 | 900 |
+| 0.4142 | 1 | 1000 |
+| 0.3570 | 1 | 1100 |
+| 0.3877 | 1 | 1200 |
+| 0.4282 | 1 | 1300 |
+| 8.8676 | 1 | 1400 |
+| 0.3732 | 1 | 1500 |
+| 0.3522 | 1 | 1600 |
+| 0.3065 | 1 | 1700 |
+| 0.3732 | 1 | 1800 |
+| 0.3965 | 1 | 1900 |
+| 0.4727 | 1 | 2000 |
+| 0.3407 | 1 | 2100 |
+| 0.3421 | 1 | 2200 |
+| 0.3847 | 1 | 2300 |
+| 0.3911 | 1 | 2400 |
+| 0.4006 | 1 | 2500 |
+| 0.2836 | 1 | 2600 |
+| 0.3968 | 1 | 2700 |
+| 0.3796 | 1 | 2800 |
+| 0.3317 | 1 | 2900 |
+| 0.2762 | 1 | 3000 |
+| 0.3027 | 1 | 3100 |
+| 0.3002 | 1 | 3200 |
+| 0.3672 | 1 | 3300 |
+| 0.2660 | 1 | 3400 |
+| 0.3145 | 1 | 3500 |
+| 0.4098 | 1 | 3600 |
+| 0.3156 | 1 | 3700 |
+| 0.2762 | 1 | 3800 |
+| 0.2557 | 1 | 3900 |
+| 0.3204 | 1 | 4000 |
+| 0.3097 | 1 | 4100 |
+| 0.2790 | 1 | 4200 |
+| 0.3395 | 1 | 4300 |
+| 0.2888 | 1 | 4400 |
+| 0.3002 | 1 | 4500 |
+| 0.3388 | 1 | 4600 |
+| 0.3744 | 1 | 4700 |
+| 0.3143 | 1 | 4800 |
+| 0.3501 | 1 | 4900 |
+| 0.2923 | 1 | 5000 |
+| 0.3152 | 1 | 5100 |
+| 0.3380 | 1 | 5200 |
+### Framework versions
+- PyTorch: 2.9.0+cu126
+- torchvision: 0.24.0+cu126
+- timm: 1.0.22
+- numpy: 2.0.2
+- safetensors: 0.7.0