# MedGemma 1.5 Surgical Navigation Assistant

Tool-augmented VLM for brain tumor resection guidance, built on MedGemma 1.5 4B.

## Overview
This repository contains two LoRA adapters fine-tuned from MedGemma 1.5 4B for intraoperative brain tumor surgical navigation:
- **Tissue Classification LoRA** (`tissue_lora/`): classifies brain tissue types from MRI slices with crosshair localization
- **Reasoning LoRA** (`reasoning_lora/`): generates grounded surgical guidance traces distilled from Gemini 3 Flash
## Architecture
The system uses a scaffold-and-verify approach: measurements pre-computed from segmentation masks and brain atlases are injected into MedGemma's context, so the VLM reasons over verified facts rather than hallucinating spatial measurements.
```
MRI Input (FLAIR)
        |
[GT Segmentation Mask] ---------------> 100% accurate tissue lookup
        |
[Brain Atlases (Harvard-Oxford, JHU)] -> Region identification + eloquent cortex warnings
        |
[Geometry Tools] ----------------------> Distances, volumes, margins
        |
        v
[MedGemma 1.5 4B + LoRA] <-- Image + verified measurements injected
        |
        v
Grounded clinical guidance
```
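The scaffold step can be sketched as a plain function over the ground-truth mask. This is an illustrative sketch, not the repository's actual tooling: the label values follow the BraTS convention, and the function name, prompt wording, and 2D (single-slice) geometry are assumptions.

```python
import numpy as np

# Illustrative BraTS-style label convention:
# 0 = background/healthy, 1 = necrotic core, 2 = edema, 4 = enhancing tumor
LABELS = {0: "healthy tissue", 1: "necrotic core", 2: "edema", 4: "enhancing tumor"}

def scaffold_facts(mask: np.ndarray, crosshair: tuple) -> str:
    """Derive verified measurements from a GT segmentation mask so the
    VLM never has to estimate them itself (hypothetical helper)."""
    y, x = crosshair
    tissue = LABELS.get(int(mask[y, x]), "unknown")

    # Distance from the crosshair to the nearest tumor-core pixel
    tumor_voxels = np.argwhere(np.isin(mask, [1, 4]))
    if tumor_voxels.size:
        dists = np.linalg.norm(tumor_voxels - np.array([y, x]), axis=1)
        margin_px = float(dists.min())
    else:
        margin_px = float("inf")

    return (
        "VERIFIED FACTS (precomputed, do not re-estimate):\n"
        f"- Tissue at crosshair: {tissue}\n"
        f"- Distance to nearest tumor-core pixel: {margin_px:.1f} px"
    )

mask = np.zeros((8, 8), dtype=int)
mask[2:4, 2:4] = 4  # small enhancing-tumor blob
print(scaffold_facts(mask, (6, 6)))
```

The returned text is what gets injected into the model's context alongside the image, so spatial numbers come from the mask, not from the model.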
## Results

### Tissue Classification (single-slice, crosshair)
| Method | Tumor acc. | Edema acc. |
|---|---|---|
| MedGemma 1.5-IT (base) | 100% | 0% |
| + LoRA fine-tune | 67% | 89% |
| + GT Scaffold | 100% | 100% |
### Reasoning Distillation (multi-slice trajectory, n=20)
| Metric | Base MedGemma | Distilled LoRA | Gemini 3 Flash (Teacher) |
|---|---|---|---|
| Grounding | 90% | 95% | 95% |
| Quality Score (0-4) | 2.3 | 3.2 | 3.6 |
| Per-slice accuracy | 47% | 79% | 81% |
## Adapter Details
Both adapters use the same LoRA configuration:
- Rank: 16
- Alpha: 32
- Dropout: 0.05
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`, `fc1`, `fc2`, `out_proj`
- Task type: `CAUSAL_LM`
## Usage

```python
from transformers import AutoProcessor, AutoModelForImageTextToText, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForImageTextToText.from_pretrained(
    "google/medgemma-1.5-4b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quantization_config,
)
processor = AutoProcessor.from_pretrained("google/medgemma-1.5-4b-it")

# Load the reasoning LoRA adapter
model = PeftModel.from_pretrained(
    model, "Summicron50mm/medgemma-surgical-nav", subfolder="reasoning_lora"
)
```
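Once the adapter is loaded, a scaffolded prompt can be assembled for the processor's chat template. The prompt wording below is illustrative only; the exact format used during fine-tuning is not specified here:

```python
# Verified measurements produced by the scaffold step (illustrative values)
verified_facts = (
    "VERIFIED FACTS (precomputed, do not re-estimate):\n"
    "- Tissue at crosshair: edema\n"
    "- Distance to enhancing tumor margin: 4.2 mm"
)

# Chat-template message structure; the FLAIR slice image is supplied
# to the processor alongside this text
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text",
             "text": verified_facts + "\n\nProvide resection guidance for this slice."},
        ],
    }
]

# With the model and processor loaded as above, generation would look like:
# inputs = processor.apply_chat_template(
#     messages, add_generation_prompt=True, tokenize=True,
#     return_dict=True, return_tensors="pt",
# ).to(model.device)
# output = model.generate(**inputs, max_new_tokens=256)
```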
## Links
- Live Demo: HuggingFace Spaces
- Code: GitHub Repository
- Base Model: google/medgemma-1.5-4b-it
- Dataset: BraTS 2021
## Citation
Built for the Kaggle MedGemma Impact Challenge.