# NoteExplain Models

Trained models for clinical note simplification - translating medical documents into patient-friendly language.

## Models

| Model | Base | Description | Overall | Accuracy | Patient-Centered |
|---|---|---|---|---|---|
| gemma-2b-distilled | gemma-2-2b-it | Final mobile model | 70% | 73% | 76% |
| gemma-2b-dpo | gemma-2-2b-it | DPO comparison | 73% | 82% | 61% |
| gemma-9b-dpo | gemma-2-9b-it | Teacher model | 79% | 91% | 70% |

## GGUF for Mobile/Local Inference

Pre-quantized GGUF models (Q4_K_M, ~1.6 GB each) for llama.cpp, Ollama, and LM Studio:

| File | Description |
|---|---|
| gguf/gemma-2b-distilled-q4_k_m.gguf | Distilled model (better patient communication) |
| gguf/gemma-2b-dpo-q4_k_m.gguf | DPO model (higher accuracy) |

### Quick Start with Ollama

```bash
# Download and run
ollama run hf.co/dejori/note-explain:gemma-2b-distilled-q4_k_m.gguf
```
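Once the model is running, it can also be queried programmatically through Ollama's local REST API. A minimal sketch, assuming an Ollama server on the default port 11434; `build_payload` and `simplify` are hypothetical helper names, not part of this repository:

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "hf.co/dejori/note-explain:gemma-2b-distilled-q4_k_m.gguf"

def build_payload(model: str, note: str) -> dict:
    """Assemble a non-streaming /api/generate request for one clinical note."""
    return {
        "model": model,
        "prompt": f"Simplify this clinical note for a patient: {note}",
        "stream": False,  # return a single JSON object instead of a token stream
    }

def simplify(note: str) -> str:
    """POST the request and return the generated text (requires Ollama running)."""
    data = json.dumps(build_payload(MODEL, note)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```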

### Quick Start with llama.cpp

```bash
# Download
wget https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-distilled-q4_k_m.gguf

# Run
./llama-cli -m gemma-2b-distilled-q4_k_m.gguf -p "Simplify this clinical note for a patient: [your note]"
```
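Since the base models are Gemma-2 instruct variants, wrapping the prompt in Gemma's chat-turn markers usually gives better results than the bare `-p` string above. A minimal sketch; `format_gemma_prompt` is a hypothetical helper using the standard Gemma turn format:

```python
def format_gemma_prompt(note: str) -> str:
    """Wrap a clinical note in Gemma-2's chat-turn markers."""
    user_msg = f"Simplify this clinical note for a patient:\n\n{note}"
    return (
        f"<start_of_turn>user\n{user_msg}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# Pass the formatted string to llama-cli via -p (or a prompt file)
print(format_gemma_prompt("Pt c/o SOB on exertion."))
```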

## LoRA Adapters

For fine-tuning or full-precision inference:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and the distilled LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
model = PeftModel.from_pretrained(base_model, "dejori/note-explain", subfolder="gemma-2b-distilled")

# Generate
prompt = "Simplify this clinical note for a patient:\n\n[clinical note]\n\nSimplified version:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training

- **DPO Training**: MedGemma-27B scored 5 candidate outputs per clinical note, creating preference pairs
- **Distillation**: the 9B DPO model generated high-quality outputs used to train the 2B model via SFT
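The preference-pair construction described above can be sketched in a few lines: given several judge-scored candidate simplifications for one note, pair the best against the worst. A minimal sketch under assumptions; the exact pairing rule (best vs. worst) and score scale are illustrative, and `make_preference_pair` is a hypothetical helper:

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    text: str
    score: float  # judge score, e.g. assigned by MedGemma-27B

def make_preference_pair(candidates: list[Candidate]) -> tuple[str, str]:
    """Return (chosen, rejected): the highest- vs. lowest-scored candidate."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    return ranked[0].text, ranked[-1].text

# Example: 5 candidate outputs for one clinical note
cands = [Candidate(f"cand{i}", s) for i, s in enumerate([3.1, 4.7, 2.0, 4.2, 3.8])]
chosen, rejected = make_preference_pair(cands)
```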

## Dataset

Training data: dejori/note-explain

## License

Apache 2.0
