thangvip/qwen3-4b-legal-sft-grpo-phase-2
Text Generation
Transformers
Safetensors
qwen3
Generated from Trainer
trl
grpo
unsloth
conversational
text-generation-inference
arxiv:2402.03300
main / qwen3-4b-legal-sft-grpo-phase-2 / .gitattributes
Commit History
31e7095 (verified): Model save, committed by thangvip 26 days ago
25ed7ad (verified): initial commit, committed by thangvip 27 days ago