Lexo-Sort
SLMs trained to perform lexograp
vijay-ravichander/Qwen2.5-0.5B-Lexo-Sort-SFT-v0 • Text Generation • 0.5B • Updated Jul 1 • 11
vijay-ravichander/Qwen2.5-0.5B-Lexo-Sort-SFT-v1 • Text Generation • 0.5B • Updated Jul 4 • 10
vijay-ravichander/Qwen2.5-0.5B-Lexo-Sort • 0.5B • Updated Jul 4 • 4
vijay-ravichander/V3-lexo-sort • Viewer • Updated Jul 8 • 1k • 16
LLM
FocusLLM: Scaling LLM's Context by Parallel Decoding • Paper • 2408.11745 • Published Aug 21, 2024 • 25
LLM Pruning and Distillation in Practice: The Minitron Approach • Paper • 2408.11796 • Published Aug 21, 2024 • 57
ColSmol256 Distill Models
vijay-ravichander/Qwen-KL-Distill • 0.2B • Updated Apr 30 • 6
vijay-ravichander/Smol-Pairwise-Distill • 0.2B • Updated Apr 30 • 8
vijay-ravichander/Qwen-MMSE-Distill • 0.2B • Updated Apr 30 • 6
vijay-ravichander/Qwen-Pairwise-Distill • 0.2B • Updated Apr 30 • 23