PainFormer

A Vision Foundation Model for Affective Computing

PainFormer · 19.60 M parameters · 5.82 GFLOPs · 160-D embeddings · PyTorch ≥ 2.0


Paper

PainFormer: A Vision Foundation Model for Automatic Pain Assessment


Highlights

Feature Description
Pre-training scale Multi-task pre-training on 14 tasks / 10.9 M samples.
Parameters 19.60 M (PainFormer encoder).
Compute 5.82 GFLOPs at 224×224 input.
Embeddings Fixed 160-D output vectors.

Figure 1. PainFormer overview.

Figure 2. PainFormer architecture.


Table of Contents

  1. Pre-trained checkpoint
  2. Quick start
  3. Fine-tuning
  4. Citation
  5. Licence & acknowledgements
  6. Contact

Pre-trained checkpoint

The checkpoint is stored under checkpoint/ in this repository.

File Size
checkpoint/painformer.pth 75 MB

Download options

# shell: direct file download into checkpoint/ (PainFormer)
mkdir -p checkpoint
wget -P checkpoint https://huggingface.co/stefanosgikas/PainFormer/resolve/main/checkpoint/painformer.pth

# Python: download via huggingface_hub
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="stefanosgikas/PainFormer",
    filename="checkpoint/painformer.pth"
)
print(ckpt_path)

Optional integrity check:

sha256sum checkpoint/painformer.pth

The checkpoint contains:

model_state_dict    # PainFormer backbone weights
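
A quick sanity check (a minimal sketch, assuming the file was downloaded to checkpoint/painformer.pth as above) is to load the checkpoint on CPU and confirm that the model_state_dict key is present:

import torch

# inspect the checkpoint contents without instantiating the model
state = torch.load("checkpoint/painformer.pth", map_location="cpu")
print(list(state.keys()))               # expect 'model_state_dict' among the keys
print(len(state["model_state_dict"]))   # number of backbone parameter tensors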

Quick start

Assumes PyTorch ≥ 2.0 and timm ≥ 0.9 are installed.

Repository layout expected:

.
├── docs/                    # images for the model card
├── architecture/            # Python modules (e.g., painformer.py)
└── checkpoint/              # painformer.pth

Extract embeddings

import torch
from timm.models import create_model
from PIL import Image
from torchvision import transforms

# model code lives in the local "architecture" folder
from architecture import painformer  # ensures registry / model class is imported

# ---------------------------------------------------------------
# Setup ---------------------------------------------------------
# ---------------------------------------------------------------
device = "cuda" if torch.cuda.is_available() else "cpu"

# VGG-Face2 statistics used during pretraining
normalize = transforms.Normalize(
    mean=[0.6068, 0.4517, 0.3800],
    std=[0.2492, 0.2173, 0.2082]
)
to_tensor = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    normalize
])

# ---------------------------------------------------------------
# Load PainFormer -----------------------------------------------
# ---------------------------------------------------------------
model = create_model('painformer').to(device)   # class registered by architecture/painformer.py
state = torch.load('checkpoint/painformer.pth', map_location=device)
model.load_state_dict(state['model_state_dict'], strict=False)

# expose embeddings (remove classification head)
model.head = torch.nn.Identity()
model.eval()

# ---------------------------------------------------------------
# One image → 160-D embedding -----------------------------------
# ---------------------------------------------------------------
img = Image.open('frame.png').convert('RGB')
x = to_tensor(img).unsqueeze(0).to(device)  # [1, 3, 224, 224]

with torch.no_grad():
    emb = model(x)        # [1, 160]
    emb = emb.squeeze(0)  # [160]

print("Embedding shape:", tuple(emb.shape))  # (160,)

Fine-tuning

Add your own classification/regression head and (optionally) un-freeze the backbone:

import torch, torch.nn as nn
from timm.models import create_model
from architecture import painformer

device = "cuda" if torch.cuda.is_available() else "cpu"
num_classes = 3  # set to your task

# Backbone → 160-D embeddings
model = create_model('painformer').to(device)
state = torch.load('checkpoint/painformer.pth', map_location=device)
model.load_state_dict(state['model_state_dict'], strict=False)
model.head = nn.Identity()  # expose 160-D embeddings (as in the quick start)

# freeze if you only need fixed embeddings
for p in model.parameters():
    p.requires_grad = False

# simple head (example)
head = nn.Sequential(
    nn.ELU(),
    nn.Linear(160, num_classes)
).to(device)

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# optional: end-to-end fine-tune
for p in model.parameters():
    p.requires_grad = True
optimizer = torch.optim.AdamW(
    list(model.parameters()) + list(head.parameters()),
    lr=3e-4, weight_decay=0.05
)
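
A minimal training step might look like the sketch below (train_loader is an assumed PyTorch DataLoader yielding (images, labels) batches; everything else reuses the objects defined above):

# one epoch over a hypothetical train_loader yielding (images, labels)
model.train()
head.train()
for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)

    optimizer.zero_grad()
    logits = head(model(images))   # [B, 160] → [B, num_classes]
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()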

Citation

@ARTICLE{gkikas_painformer_2025,
  author={Gkikas, Stefanos and Rojas, Raul Fernandez and Tsiknakis, Manolis},
  journal={IEEE Transactions on Affective Computing}, 
  title={PainFormer: A Vision Foundation Model for Automatic Pain Assessment},
  year={2025},
  volume={},
  number={},
  pages={1-18},
  doi={10.1109/TAFFC.2025.3605475}
}

Licence & acknowledgements

  • Code & weights: MIT Licence – see LICENSE.

Contact

Email Stefanos Gkikas: gkikas[at]ics[dot]forth[dot]gr / gikasstefanos[at]gmail[dot]com
