language:
  - en
base_model:
  - unsloth/Qwen3-1.7B-unsloth-bnb-4bit
pipeline_tag: text-classification

Fine-tuned Qwen3-1.7B-Instruct: From Doctor Notes 👨🏼‍⚕️ to JSON 🗒️

Task: Convert short doctor/therapist notes into a JSON object with the following fields:

  • summary (string)
  • tags (comma-separated)
  • risk-level (0–10 integer)
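
As a rough illustration of the output contract above, the following sketch checks a model response against the three fields. The function names `validate_note_json` and `split_tags` are hypothetical helpers introduced here, not part of the model or its repository.

```python
import json
from typing import List

# Field names and types taken from the list above.
REQUIRED_FIELDS = {"summary": str, "tags": str, "risk-level": int}


def validate_note_json(raw: str) -> dict:
    """Parse a model response and check it against the three documented fields."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"field {name!r} should be {expected_type.__name__}")
    if not 0 <= data["risk-level"] <= 10:
        raise ValueError("risk-level must be between 0 and 10")
    return data


def split_tags(tags: str) -> List[str]:
    """Turn the comma-separated tag string into a clean list."""
    return [tag.strip() for tag in tags.split(",") if tag.strip()]
```

A downstream application could call `validate_note_json` on every response and fall back to manual handling when it raises.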

Base Model

🚀 Very lightweight: the model runs locally on almost any doctor's computer, so confidential patient data never has to leave the machine.

  • unsloth/Qwen3-1.7B-unsloth-bnb-4bit

Training

  • Method: QLoRA (r=16, alpha=32, dropout=0.03)
  • Target modules: q_proj,k_proj,v_proj,o_proj
  • Context length: 2048
  • Optimizer: adamw_8bit
  • Time: One epoch, 26 min on one L4 GPU
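
To make the hyperparameters above concrete, here is a minimal sketch that collects them in a plain dataclass and derives two quantities that follow from them. The class `QLoRASettings` is hypothetical; its field names loosely mirror the argument names used by `peft.LoraConfig` and common trainer arguments, but this is not the author's training script. The 2048 x 2048 layer in the usage note is an illustrative size, not a statement about the base model's dimensions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QLoRASettings:
    """Hyperparameters listed in the Training section (container is hypothetical)."""
    r: int = 16
    lora_alpha: int = 32
    lora_dropout: float = 0.03
    target_modules: List[str] = field(
        default_factory=lambda: ["q_proj", "k_proj", "v_proj", "o_proj"]
    )
    max_seq_length: int = 2048
    optim: str = "adamw_8bit"
    num_train_epochs: int = 1

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / r.
        return self.lora_alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # One adapted linear layer adds A (r x d_in) and B (d_out x r).
        return self.r * (d_in + d_out)


settings = QLoRASettings()
```

With these values, `settings.scaling` is 2.0, and adapting a hypothetical 2048 x 2048 projection adds `settings.adapter_params(2048, 2048)` = 65536 trainable parameters.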

Dataset

A total of 4524 training pairs, each consisting of an input doctor note and the corresponding JSON output. 565 evaluation pairs were used during training and 395 pairs for final model testing.
Around 60% of the data was crawled from Reddit (subreddits such as r/depression); the other 40% was synthetically generated by GPT-5-mini.
Example data format:

{"input": "You are a clinical note assistant. Given terse doctor notes from a patient session, produce a JSON with fields summary (clear, neutral), tags (comma-separated), and risk-level (0-10). Only output valid JSON.\n\nDoctor notes:\nPatient reports recurrent flashbacks and nightmares after military deployment and avoids reminders. States occasional passive thoughts about death but no plan or intent; increased startle and hypervigilance noted. Continue trauma-focused therapy and safety planning reviewed.", "output": "{\"summary\": \"Patient reports recurrent PTSD symptoms with flashbacks, nightmares, avoidance, hypervigilance, and occasional passive thoughts about death but no plan or intent.\", \"tags\": \"PTSD,Anxiety,Self-harm\", \"risk-level\": 6}"}

Evaluation Results

| Metric | Fine-tuned model | Base model | Improvement |
|---|---|---|---|
| JSON validity rate | 0.9848 | 0.9570 | +2.9% ✅ |
| Tag precision | 0.7540 | 0.1850 | +307.6% ✅ |
| Tag recall | 0.7159 | 0.3406 | +110.2% ✅ |
| Tag F1 score | 0.7344 | 0.2398 | +206.3% ✅ |
| Tag exact match | 0.2648 | 0.0000 | n/a (base value is 0) |
| Risk MAE | 0.7352 | 2.2434 | −67.2% (lower is better) ✅ |
| Risk RMSE | 1.0779 | 2.7898 | −61.4% (lower is better) ✅ |
| Rouge F1 score | 0.4828 | 0.4240 | +13.9% ✅ |
| High risk recall | 0.9878 | 0.9250 | +6.8% ✅ |
| High risk precision | 0.8804 | 0.4868 | +80.9% ✅ |
| High risk F1 score | 0.9310 | 0.6379 | +45.9% ✅ |
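
The improvement column is consistent with a relative change against the base model. The sketch below reproduces that calculation and shows one plausible way the tag and risk metrics could be computed per example. The set-based tag matching and the helper names are assumptions; the actual evaluation code lives in the author's repository and may aggregate differently.

```python
import math
from typing import Iterable, Sequence, Tuple


def relative_change(fine_tuned: float, base: float) -> float:
    """Percentage change of the fine-tuned value relative to the base value."""
    return (fine_tuned - base) / base * 100.0


def tag_prf(predicted: Iterable[str], gold: Iterable[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall and F1 for one example (assumed definition)."""
    pred, ref = set(predicted), set(gold)
    overlap = len(pred & ref)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def mae(pred: Sequence[float], gold: Sequence[float]) -> float:
    """Mean absolute error between predicted and reference risk levels."""
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(gold)


def rmse(pred: Sequence[float], gold: Sequence[float]) -> float:
    """Root mean squared error between predicted and reference risk levels."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(gold))
```

For example, `relative_change(0.7540, 0.1850)` rounds to 307.6, matching the tag precision row. The relative change is undefined when the base value is 0, which is why the tag exact match row has no percentage.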

A more comprehensive model evaluation with additional plots can be found in the GitHub repository.

Intended Use & Limitations

  • For summarizing structured notes only. Not a diagnostic tool.
  • High-risk predictions (≥8) should be reviewed by a clinician.
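
The review rule above could be applied as a simple post-processing gate. This is a minimal sketch; the names `needs_clinician_review` and `triage` are hypothetical, and only the threshold of 8 comes from this card.

```python
from typing import Dict, List

HIGH_RISK_THRESHOLD = 8  # threshold stated in this section


def needs_clinician_review(risk_level: int, threshold: int = HIGH_RISK_THRESHOLD) -> bool:
    """Return True when a predicted risk level should be escalated to a clinician."""
    return risk_level >= threshold


def triage(predictions: List[Dict]) -> List[Dict]:
    """Keep only the parsed model outputs that require review."""
    return [p for p in predictions if needs_clinician_review(p["risk-level"])]
```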

Prompt format

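Use the chat template shipped with this model. The example below shows the expected layout, with an empty think block before the JSON answer.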

<|im_start|>system  
You are a clinical note assistant. Given terse doctor notes from a patient session, output ONLY valid JSON with fields: summary (clear, neutral), tags (comma-separated), and risk-level (0-10).<|im_end|>  
<|im_start|>user  
Patient reports feeling increasingly anxious about work deadlines and has trouble sleeping at night. She mentions a racing mind and difficulty concentrating during the day. No self-harm thoughts expressed.<|im_end|>  
<|im_start|>assistant  
<think>  

</think>  

{  
  "summary": "Patient reports increased anxiety about work deadlines, difficulty sleeping, racing mind, and trouble concentrating during the day. No self-harm thoughts.",  
  "tags": "anxiety, insomnia, concentration, stress",  
  "risk-level": 4  
}<|im_end|>
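
For illustration, the layout above can be assembled and parsed with plain string handling. This is a sketch under assumptions: the exact whitespace is inferred from the example, and `build_prompt` and `extract_json` are hypothetical helpers. In practice, applying the shipped template through the tokenizer's `apply_chat_template` method in `transformers` is the safer route.

```python
import json
import re

SYSTEM_PROMPT = (
    "You are a clinical note assistant. Given terse doctor notes from a patient "
    "session, output ONLY valid JSON with fields: summary (clear, neutral), "
    "tags (comma-separated), and risk-level (0-10)."
)


def build_prompt(notes: str) -> str:
    """Assemble the prompt, ending with an empty think block for the assistant turn."""
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{notes}<|im_end|>\n"
        "<|im_start|>assistant\n<think>\n\n</think>\n\n"
    )


def extract_json(completion: str) -> dict:
    """Strip any think block and end-of-turn marker, then parse the JSON answer."""
    text = re.sub(r"<think>.*?</think>", "", completion, flags=re.DOTALL)
    text = text.replace("<|im_end|>", "").strip()
    return json.loads(text)
```

The string returned by `build_prompt` would be passed to the generation backend, and `extract_json` applied to the generated text.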