Wechat Style Sft
🌍
QLoRA fine-tuning demo on WeChat essays for style adaptation
SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization
Diffusion Language Models are Super Data Learners
The official Hugging Face space for the National University of Singapore. Users with a verified email ending in nus.edu.sg or u.nus.edu are welcome to join.