HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma-share-experts-2nd-epoch-lr-5e-6 Text Generation • 16B • Updated Sep 19, 2025 • 2
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma-share-experts-2nd-epoch Text Generation • 16B • Updated Sep 19, 2025 • 2 • 1
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-remove-aux-only Text Generation • 126k • Updated Sep 18, 2025 • 1
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-3-gamma Text Generation • 126k • Updated Sep 18, 2025 • 3
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-bs4 Text Generation • 16B • Updated Sep 17, 2025 • 2
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma Text Generation • 16B • Updated Sep 17, 2025 • 6
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-aux-free-sft-math7k-1epoch-bs4-4.51 Text Generation • 16B • Updated Sep 17, 2025 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-aux-free-sft-math7k-1epoch-1e-4-gamma Updated Sep 17, 2025
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-aux-free-sft-math7k-1epoch-bs4 Text Generation • 126k • Updated Sep 17, 2025 • 2
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-3epoch-bias-update 14B • Updated Sep 17, 2025 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-part2-run1 Text Generation • 14B • Updated Sep 17, 2025 • 5
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-1epoch Text Generation • 14B • Updated Sep 17, 2025 • 2
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-5-gamma-3epoch-bias-update 14B • Updated Sep 16, 2025 • 2
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-4-gamma-3epoch Text Generation • 14B • Updated Sep 15, 2025 • 2
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-remov-aux-only Text Generation • 14B • Updated Sep 15, 2025 • 4 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-6-gamma Text Generation • 14B • Updated Sep 15, 2025 • 1 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-2-gamma Text Generation • 14B • Updated Sep 15, 2025 • 2 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-4-gamma Text Generation • 14B • Updated Sep 15, 2025 • 3 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma Text Generation • 14B • Updated Sep 15, 2025 • 1 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-5e-5-gamma Text Generation • 14B • Updated Sep 15, 2025 • 2 • 1
HectorHe/Deepseek-V2-13B-Math7K-Expert-Enhance-Subset-Expert-MoE-32-experts Text Generation • 16B • Updated Aug 18, 2025 • 14 • 1