INT vs. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published 17 days ago • 68
🦙 Llama-3.2-Taiwan Collection Based on the meta-llama/Llama-3.2-*B models, continually pre-trained on a large corpus of Traditional Chinese and non-Chinese text. • 9 items • Updated Apr 26 • 1
UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset Paper • 2510.20661 • Published 23 days ago • 13
🏎️ Formosa-1 Series Collection A collection of Formosa-1 (F1) reasoning models and datasets focused on Traditional Chinese instruction-following and logic. • 4 items • Updated Oct 13 • 4
📋 Eval Logs Collection Benchmark logs generated with Twinkle Eval, recording each model's outputs for every prompt. • 2 items • Updated Oct 13 • 4
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25 • 205
InternVL3.5 Collection This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated Sep 28 • 102
gpt-oss Collection Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated Aug 7 • 379
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 75
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9 • 711
Article Topic 23: What is LLM Inference, its challenges, and solutions Jan 17 • 17