INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published 20 days ago • 69
Parallel Loop Transformer for Efficient Test-Time Computation Scaling Paper • 2510.24824 • Published 21 days ago • 15 • 4
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning Paper • 2504.13914 • Published Apr 10 • 4
IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs Paper • 2504.15415 • Published Apr 21 • 22
Extrapolating Multilingual Understanding Models as Multilingual Generators Paper • 2305.13140 • Published May 22, 2023
Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment Paper • 2405.17871 • Published May 28, 2024 • 1
World to Code: Multi-modal Data Generation via Self-Instructed Compositional Captioning and Filtering Paper • 2409.20424 • Published Sep 30, 2024
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling Paper • 2501.16975 • Published Jan 28 • 31