---
license: apache-2.0
tags:
- medical
- clinical-notes
- patient-communication
- lora
- peft
- medgemma
- gguf
language:
- en
library_name: peft
---

# NoteExplain Models

Models trained for clinical note simplification: they translate medical documents into patient-friendly language.

## Models

| Model | Base | Description | Overall | Accuracy | Patient-Centered |
|-------|------|-------------|---------|----------|------------------|
| **gemma-2b-distilled** | gemma-2-2b-it | Final mobile model | 70% | 73% | **76%** |
| **gemma-2b-dpo** | gemma-2-2b-it | DPO comparison | **73%** | **82%** | 61% |
| **gemma-9b-dpo** | gemma-2-9b-it | Teacher model | 79% | 91% | 70% |

Bold marks the higher score between the two 2B models; the 9B teacher is listed for reference.

## GGUF for Mobile/Local Inference

Pre-quantized GGUF models (Q4_K_M, ~1.6 GB each) for llama.cpp, Ollama, and LM Studio:

| File | Description | Download |
|------|-------------|----------|
| `gguf/gemma-2b-distilled-q4_k_m.gguf` | Distilled model (better patient communication) | [Download](https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-distilled-q4_k_m.gguf) |
| `gguf/gemma-2b-dpo-q4_k_m.gguf` | DPO model (higher accuracy) | [Download](https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-dpo-q4_k_m.gguf) |

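For scripted setups, the two download links can be generated instead of copied by hand. The following is a minimal sketch using only the Python standard library; the helper names (`gguf_url`, `download`) are illustrative and not part of this repository, and the URL pattern is simply the one shown in the table.

```python
import urllib.request
from pathlib import Path

# URL prefix taken from the download links in this model card.
BASE_URL = "https://huggingface.co/dejori/note-explain/resolve/main/gguf"
VARIANTS = ("distilled", "dpo")  # the two 2B GGUF files listed in the table


def gguf_url(variant: str) -> str:
    """Return the download URL for one of the listed GGUF files."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS}")
    return f"{BASE_URL}/gemma-2b-{variant}-q4_k_m.gguf"


def download(variant: str, dest_dir: str = ".") -> Path:
    """Download the file (roughly 1.6 GB) unless it already exists locally."""
    url = gguf_url(variant)
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


# Example (performs a large network download):
# path = download("distilled")
```

Skipping the download when the file already exists keeps repeated runs cheap, at the cost of not detecting a partially downloaded file.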
### Quick Start with Ollama

```bash
# Download and run
ollama run hf.co/dejori/note-explain:gemma-2b-distilled-q4_k_m.gguf
```

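Once the model has been pulled, it can also be called from a script. This is a minimal sketch, assuming Ollama is serving on its default local port (11434) and that its `/api/generate` endpoint is used with streaming turned off. The prompt wording mirrors the llama.cpp example in this card, and `build_request` and `simplify` are illustrative names.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Same model reference as in the `ollama run` command.
MODEL = "hf.co/dejori/note-explain:gemma-2b-distilled-q4_k_m.gguf"


def build_request(note: str) -> urllib.request.Request:
    """Build a non-streaming generate request for one clinical note."""
    payload = {
        "model": MODEL,
        "prompt": f"Simplify this clinical note for a patient: {note}",
        "stream": False,  # ask for a single JSON object instead of chunks
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def simplify(note: str, timeout: float = 120.0) -> str:
    """Send the note to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(note), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example (requires a running Ollama server with the model pulled):
# print(simplify("[your note]"))
```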
### Quick Start with llama.cpp

```bash
# Download the quantized model
wget https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-distilled-q4_k_m.gguf

# Run a single prompt (replace [your note] with the note text)
./llama-cli -m gemma-2b-distilled-q4_k_m.gguf -p "Simplify this clinical note for a patient: [your note]"
```

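Real clinical notes often contain quotes and line breaks that are awkward to pass on a shell command line. The sketch below builds the same `llama-cli` invocation as an argument list, which avoids shell quoting altogether. The function names are illustrative, and the `-n 512` token limit is an assumption chosen to match `max_new_tokens=512` in the Python example in this card; some llama.cpp builds may start an interactive session, in which case the call would not return on its own.

```python
import subprocess


def llama_cli_args(model_path: str, note: str, max_tokens: int = 512,
                   binary: str = "./llama-cli") -> list[str]:
    """Build the argument list for a single-prompt llama-cli run."""
    prompt = f"Simplify this clinical note for a patient: {note.strip()}"
    # -m: model file, -p: prompt, -n: maximum number of tokens to generate
    return [binary, "-m", model_path, "-p", prompt, "-n", str(max_tokens)]


def run_llama(model_path: str, note: str) -> str:
    """Run llama-cli and return whatever it writes to stdout."""
    result = subprocess.run(
        llama_cli_args(model_path, note),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Example (requires llama-cli and the downloaded GGUF file):
# print(run_llama("gemma-2b-distilled-q4_k_m.gguf", "[your note]"))
```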
## LoRA Adapters

For further fine-tuning or full-precision inference, load a LoRA adapter on top of its base model with PEFT:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer, then attach the distilled adapter
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
model = PeftModel.from_pretrained(base_model, "dejori/note-explain", subfolder="gemma-2b-distilled")

# Generate (replace [clinical note] with the note text)
prompt = "Simplify this clinical note for a patient:\n\n[clinical note]\n\nSimplified version:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
# The decoded text contains the prompt followed by the generated continuation
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

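Because `generate` returns the prompt tokens followed by the new tokens, the example above prints the prompt together with the simplified note. A minimal sketch of a helper that returns only the generated part is shown here; `build_prompt` and `generate_simplified` are illustrative names, and the prompt template is the one used in the example.

```python
# Prompt template copied from the example in this card.
PROMPT_TEMPLATE = "Simplify this clinical note for a patient:\n\n{note}\n\nSimplified version:"


def build_prompt(note: str) -> str:
    """Insert the clinical note into the prompt template."""
    return PROMPT_TEMPLATE.format(note=note.strip())


def generate_simplified(model, tokenizer, note: str, max_new_tokens: int = 512) -> str:
    """Generate a simplified note and return only the newly generated text."""
    inputs = tokenizer(build_prompt(note), return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]  # number of prompt tokens
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    new_tokens = outputs[0][prompt_len:]  # drop the echoed prompt tokens
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# Example, with `model` and `tokenizer` loaded as in the snippet above:
# print(generate_simplified(model, tokenizer, "[clinical note]"))
```

Slicing by token count rather than stripping the prompt string from the decoded text avoids mismatches when decoding does not reproduce the prompt character for character.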
## Training

- **DPO training**: MedGemma-27B scored five candidate outputs for each clinical note, and those scores were used to build the preference pairs for DPO.
- **Distillation**: The 9B DPO model (`gemma-9b-dpo`) generated the outputs used as supervised fine-tuning (SFT) targets for the 2B model.

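The two data-preparation steps can be sketched as follows. This card does not say how the five scores were turned into pairs, so the rule used here (highest-scored candidate as `chosen`, lowest-scored as `rejected`, ties skipped) is an assumption, as are all names and the record layouts; the `prompt`/`chosen`/`rejected` layout is a common convention in DPO tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    text: str     # one candidate simplification of a clinical note
    score: float  # judge score (assigned by MedGemma-27B in this project)


def build_preference_pair(prompt: str, candidates: list[Candidate]) -> Optional[dict]:
    """Assumed rule: best-scored candidate is chosen, worst-scored is rejected."""
    if len(candidates) < 2:
        raise ValueError("need at least two scored candidates")
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    best, worst = ranked[0], ranked[-1]
    if best.score == worst.score:
        return None  # no preference signal when all scores are equal
    return {"prompt": prompt, "chosen": best.text, "rejected": worst.text}


def build_sft_example(prompt: str, teacher_output: str) -> dict:
    """Distillation record: the teacher's output becomes the SFT target."""
    return {"prompt": prompt, "completion": teacher_output}


# Usage with made-up scores:
pair = build_preference_pair(
    "Simplify this clinical note for a patient: [note]",
    [Candidate("draft A", 3.0), Candidate("draft B", 4.5), Candidate("draft C", 2.0)],
)
# pair["chosen"] is "draft B" and pair["rejected"] is "draft C"
```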
## Dataset

Training data: [dejori/note-explain](https://huggingface.co/datasets/dejori/note-explain)

## License

Apache 2.0