MedGemma 4B Anti-Hallucination

MedGemma 4B fine-tuned with RLFR (Reinforcement Learning from Feature Rewards) to reduce hallucination in pharmacovigilance document generation.

  • Base model: google/medgemma-4b-it
  • Method: RLFR with 6 reward features (factual accuracy, completeness, format compliance, etc.)
  • Task: Generate MedWatch 3500A pharmacovigilance narratives from CRF data

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("AlphaRaven/medgemma-4b-antihallu")
tokenizer = AutoTokenizer.from_pretrained("AlphaRaven/medgemma-4b-antihallu")

Part of the Clinical Trial Simulation Engine pipeline.

Downloads last month
27
Safetensors
Model size
4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for AlphaRaven/medgemma-4b-antihallu

Finetuned
(550)
this model
Quantizations
1 model