---
license: apache-2.0
tags:
- medical
- clinical-notes
- patient-communication
- lora
- peft
- medgemma
- gguf
language:
- en
library_name: peft
---
# NoteExplain Models
Fine-tuned models for clinical note simplification: they translate medical documents into patient-friendly language.
## Models
| Model | Base | Description | Overall | Accuracy | Patient-Centered |
|-------|------|-------------|---------|----------|------------------|
| **gemma-2b-distilled** | gemma-2-2b-it | Final mobile model | 70% | 73% | **76%** |
| **gemma-2b-dpo** | gemma-2-2b-it | DPO comparison | **73%** | **82%** | 61% |
| **gemma-9b-dpo** | gemma-2-9b-it | Teacher model | 79% | 91% | 70% |

Bold marks the higher score of the two 2B models in each column.
## GGUF for Mobile/Local Inference
Pre-quantized GGUF models (Q4_K_M, ~1.6 GB each) for llama.cpp, Ollama, and LM Studio:
| File | Description | Download |
|------|-------------|----------|
| `gguf/gemma-2b-distilled-q4_k_m.gguf` | Distilled model (better patient communication) | [Download](https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-distilled-q4_k_m.gguf) |
| `gguf/gemma-2b-dpo-q4_k_m.gguf` | DPO model (higher accuracy) | [Download](https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-dpo-q4_k_m.gguf) |
### Quick Start with Ollama
```bash
# Download and run
ollama run hf.co/dejori/note-explain:gemma-2b-distilled-q4_k_m.gguf
```
### Quick Start with llama.cpp
```bash
# Download
wget https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-distilled-q4_k_m.gguf
# Run
./llama-cli -m gemma-2b-distilled-q4_k_m.gguf -p "Simplify this clinical note for a patient: [your note]"
```
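The same files can also be fetched from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the names `GGUF_FILES`, `gguf_url`, and `download_gguf` are hypothetical helpers introduced here for illustration and are not part of this repository.

```python
# Sketch: resolve or download one of the GGUF files listed in the table above.
# REPO_ID and the file paths come from this model card; the helper names are
# illustrative only.
REPO_ID = "dejori/note-explain"
GGUF_FILES = {
    "distilled": "gguf/gemma-2b-distilled-q4_k_m.gguf",
    "dpo": "gguf/gemma-2b-dpo-q4_k_m.gguf",
}


def gguf_url(variant: str) -> str:
    """Return the direct download URL for a variant ("distilled" or "dpo")."""
    return f"https://huggingface.co/{REPO_ID}/resolve/main/{GGUF_FILES[variant]}"


def download_gguf(variant: str) -> str:
    """Download the file into the local Hugging Face cache and return its path."""
    # Imported lazily so gguf_url() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=GGUF_FILES[variant])


if __name__ == "__main__":
    print(gguf_url("distilled"))
```

The path returned by `download_gguf("distilled")` can then be passed to `llama-cli -m`.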
## LoRA Adapters
For fine-tuning or full-precision inference:
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and its tokenizer
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Attach the distilled LoRA adapter from this repository
model = PeftModel.from_pretrained(base_model, "dejori/note-explain", subfolder="gemma-2b-distilled")
model.eval()

# Generate
prompt = "Simplify this clinical note for a patient:\n\n[clinical note]\n\nSimplified version:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
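Decoding `outputs[0]` returns the prompt followed by the generated text. A small sketch of two helpers that build the prompt in the format used above and keep only the generated part is shown here; `build_prompt` and `extract_simplified` are hypothetical names, and splitting on the `Simplified version:` marker is an assumption about how the output is laid out.

```python
# Sketch: build the prompt shown above and strip the echoed prompt from the
# decoded output. Helper names are illustrative, not part of this repository.
PROMPT_TEMPLATE = (
    "Simplify this clinical note for a patient:\n\n{note}\n\nSimplified version:"
)
MARKER = "Simplified version:"


def build_prompt(note: str) -> str:
    """Insert a clinical note into the prompt format used in this model card."""
    return PROMPT_TEMPLATE.format(note=note.strip())


def extract_simplified(decoded: str) -> str:
    """Return the text after the first marker, or the whole text if it is absent."""
    _, sep, tail = decoded.partition(MARKER)
    return tail.strip() if sep else decoded.strip()
```

With these helpers, `prompt = build_prompt(note)` replaces the hard-coded string, and `extract_simplified(tokenizer.decode(outputs[0], skip_special_tokens=True))` yields only the patient-facing text.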
## Training
- **DPO Training**: MedGemma-27B scored five candidate outputs per clinical note, and the scores were used to create preference pairs
- **Distillation**: The 9B-DPO model generated high-quality outputs, which were used to train the 2B model via supervised fine-tuning (SFT)
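
The two data-preparation steps above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used to train these models: it assumes higher judge scores are better, pairs the highest-scored candidate with the lowest-scored one, and uses `prompt`/`chosen`/`rejected` and `prompt`/`completion` record layouts. The names `Candidate`, `make_preference_pair`, and `make_sft_record` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    text: str
    score: float  # judge score; higher is assumed to be better


def make_preference_pair(
    note: str, candidates: list[Candidate], min_gap: float = 0.0
) -> Optional[dict]:
    """Pair the best-scored candidate (chosen) with the worst-scored (rejected).

    Returns None when fewer than two candidates are given or when the score
    gap is not larger than min_gap, since such a pair carries no preference.
    """
    if len(candidates) < 2:
        return None
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    best, worst = ranked[0], ranked[-1]
    if best.score - worst.score <= min_gap:
        return None
    return {"prompt": note, "chosen": best.text, "rejected": worst.text}


def make_sft_record(note: str, teacher_output: str) -> dict:
    """Wrap a teacher-generated output as one supervised fine-tuning example."""
    return {"prompt": note, "completion": teacher_output}
```

Other pairing rules (for example, every pair of candidates whose scores differ by a margin) would fit the same description; the best-versus-worst rule is chosen here only because it is the simplest to show.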
## Dataset
Training data: [dejori/note-explain](https://huggingface.co/datasets/dejori/note-explain)
## License
Apache 2.0