
	---
	license: apache-2.0
	tags:
	- medical
	- clinical-notes
	- patient-communication
	- lora
	- peft
	- medgemma
	- gguf
	language:
	- en
	library_name: peft
	---

	# NoteExplain Models

	Fine-tuned models for clinical note simplification: they rewrite medical documents in patient-friendly language.

	## Models

	| Model | Base | Description | Overall | Accuracy | Patient-Centered |
	|-------|------|-------------|---------|----------|------------------|
	| gemma-2b-distilled | gemma-2-2b-it | Final mobile model | 70% | 73% | 76% |
	| gemma-2b-dpo | gemma-2-2b-it | DPO comparison | 73% | 82% | 61% |
	| gemma-9b-dpo | gemma-2-9b-it | Teacher model | 79% | 91% | 70% |

	## GGUF for Mobile/Local Inference

	Pre-quantized GGUF models (Q4_K_M, ~1.6 GB each) for llama.cpp, Ollama, and LM Studio:

	| File | Description | Download |
	|------|-------------|----------|
	| `gguf/gemma-2b-distilled-q4_k_m.gguf` | Distilled model (better patient communication) | [Download](https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-distilled-q4_k_m.gguf) |
	| `gguf/gemma-2b-dpo-q4_k_m.gguf` | DPO model (higher accuracy) | [Download](https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-dpo-q4_k_m.gguf) |

	### Quick Start with Ollama

	```bash
	# Download and run
	ollama run hf.co/dejori/note-explain:gemma-2b-distilled-q4_k_m.gguf
	```

	### Quick Start with llama.cpp

	```bash
	# Download
	wget https://huggingface.co/dejori/note-explain/resolve/main/gguf/gemma-2b-distilled-q4_k_m.gguf

	# Run
	./llama-cli -m gemma-2b-distilled-q4_k_m.gguf -p "Simplify this clinical note for a patient: [your note]"
	```
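Where `wget` is not available, the same file can be fetched from Python. The sketch below is one possible approach, not part of the released tooling: it assumes the `huggingface_hub` package is installed, and `gguf_path_in_repo` and `download_gguf` are hypothetical helper names. The file paths are the ones listed in the table above.

```python
REPO_ID = "dejori/note-explain"
VARIANTS = ("distilled", "dpo")  # the two GGUF files listed in the table above


def gguf_path_in_repo(variant: str) -> str:
    """Return the in-repo path of the Q4_K_M GGUF file for a variant."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}, expected one of {VARIANTS}")
    return f"gguf/gemma-2b-{variant}-q4_k_m.gguf"


def download_gguf(variant: str = "distilled") -> str:
    """Download the GGUF file into the local Hugging Face cache and return its path."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=gguf_path_in_repo(variant))
```

The returned path can then be passed to `llama-cli` with `-m`.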

	## LoRA Adapters

	For fine-tuning or full-precision inference:

	```python
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Load the base model and tokenizer, then attach the distilled LoRA adapter
	base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
	tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
	model = PeftModel.from_pretrained(base_model, "dejori/note-explain", subfolder="gemma-2b-distilled")

	# Generate (replace [clinical note] with the text to simplify)
	prompt = "Simplify this clinical note for a patient:\n\n[clinical note]\n\nSimplified version:"
	inputs = tokenizer(prompt, return_tensors="pt")
	outputs = model.generate(**inputs, max_new_tokens=512)
	# outputs[0] holds the prompt tokens followed by the newly generated tokens
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```
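Because `generate` returns the prompt tokens followed by the new tokens, the decoded text starts with the prompt itself. A minimal sketch of filling the prompt template and trimming the echoed prompt is shown here. `build_prompt` and `strip_prompt` are hypothetical helper names; the template string is copied from the example above.

```python
PROMPT_TEMPLATE = (
    "Simplify this clinical note for a patient:\n\n{note}\n\nSimplified version:"
)


def build_prompt(note: str) -> str:
    """Insert a clinical note into the prompt template used in the example above."""
    return PROMPT_TEMPLATE.format(note=note.strip())


def strip_prompt(decoded: str, prompt: str) -> str:
    """Remove the echoed prompt from decoded model output, if it is present."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    # Decoding may not reproduce the prompt exactly; fall back to the full text.
    return decoded.strip()
```

With these helpers, `strip_prompt(tokenizer.decode(outputs[0], skip_special_tokens=True), prompt)` would leave only the simplified note whenever the decoded text reproduces the prompt exactly.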

	## Training

	- DPO training: MedGemma-27B scored 5 candidate outputs per clinical note, and the scores were used to create preference pairs
	- Distillation: the gemma-9b-dpo model generated high-quality outputs, which were used to train the 2B model via supervised fine-tuning (SFT)
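
The DPO step above turns judge scores into preference pairs. How the pairs were actually selected is not stated here, so the following is only an illustrative sketch under one common assumption: pair the highest-scoring candidate with the lowest-scoring one, and skip notes where the scores are too close. `Candidate`, `make_preference_pair`, and `min_margin` are hypothetical names, and the score scale is assumed. The `prompt` / `chosen` / `rejected` keys follow the layout that DPO trainers such as TRL's `DPOTrainer` expect.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    text: str     # one candidate simplification of the note
    score: float  # judge score, higher is better (assumed scale)


def make_preference_pair(
    prompt: str, candidates: List[Candidate], min_margin: float = 1.0
) -> Optional[dict]:
    """Pair the best and worst candidates, or return None if they are too close."""
    if len(candidates) < 2:
        return None
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    best, worst = ranked[0], ranked[-1]
    # Skip near-ties: they carry little preference signal (an assumption).
    if best.score - worst.score < min_margin:
        return None
    return {"prompt": prompt, "chosen": best.text, "rejected": worst.text}
```

Other pairing schemes (for example, all pairs above a margin) would fit the same description; best-versus-worst is simply the shortest to write down.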

	## Dataset

	Training data: [dejori/note-explain](https://huggingface.co/datasets/dejori/note-explain)

	## License

	Apache 2.0