Collections
Discover the best community collections!
Collections including paper arXiv:2411.19146
-
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Paper • 2411.02337 • Published • 36 -
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Paper • 2411.04996 • Published • 51 -
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 68 -
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
Paper • 2410.08815 • Published • 47
-
VILA^2: VILA Augmented VILA
Paper • 2407.17453 • Published • 41 -
Octopus v4: Graph of language models
Paper • 2404.19296 • Published • 118 -
Octo-planner: On-device Language Model for Planner-Action Agents
Paper • 2406.18082 • Published • 48 -
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Paper • 2408.15518 • Published • 42
-
No More Adam: Learning Rate Scaling at Initialization is All You Need
Paper • 2412.11768 • Published • 43 -
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Paper • 2412.14161 • Published • 51 -
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments
Paper • 2408.10945 • Published • 11 -
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53
-
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 34 -
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Paper • 2409.11406 • Published • 27 -
Diffusion Models Are Real-Time Game Engines
Paper • 2408.14837 • Published • 126 -
Segment Anything with Multiple Modalities
Paper • 2408.09085 • Published • 22
-
No More Adam: Learning Rate Scaling at Initialization is All You Need
Paper • 2412.11768 • Published • 43 -
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Paper • 2412.14161 • Published • 51 -
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments
Paper • 2408.10945 • Published • 11 -
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53
-
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Paper • 2411.02337 • Published • 36 -
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Paper • 2411.04996 • Published • 51 -
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 68 -
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
Paper • 2410.08815 • Published • 47
-
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 34 -
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Paper • 2409.11406 • Published • 27 -
Diffusion Models Are Real-Time Game Engines
Paper • 2408.14837 • Published • 126 -
Segment Anything with Multiple Modalities
Paper • 2408.09085 • Published • 22
-
VILA^2: VILA Augmented VILA
Paper • 2407.17453 • Published • 41 -
Octopus v4: Graph of language models
Paper • 2404.19296 • Published • 118 -
Octo-planner: On-device Language Model for Planner-Action Agents
Paper • 2406.18082 • Published • 48 -
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Paper • 2408.15518 • Published • 42