Collections
Collections including paper arxiv:2402.14905
- DoRA: Weight-Decomposed Low-Rank Adaptation
  Paper • 2402.09353 • Published • 30
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Paper • 2402.14905 • Published • 134
- Resonance RoPE: Improving Context Length Generalization of Large Language Models
  Paper • 2403.00071 • Published • 24
- AtP*: An efficient and scalable method for localizing LLM behaviour to components
  Paper • 2403.00745 • Published • 14

- Nemotron-4 15B Technical Report
  Paper • 2402.16819 • Published • 46
- InternLM2 Technical Report
  Paper • 2403.17297 • Published • 34
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
  Paper • 2404.04167 • Published • 14
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Paper • 2402.14905 • Published • 134

- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Paper • 2402.14905 • Published • 134
- Sensor-based Multi-Robot Search and Coverage with Spatial Separation in Unstructured Environments
  Paper • 2403.01710 • Published • 2
- EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models
  Paper • 2308.14352 • Published
- Slimmable Encoders for Flexible Split DNNs in Bandwidth and Resource Constrained IoT Systems
  Paper • 2306.12691 • Published • 2

- Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
  Paper • 2402.14797 • Published • 21
- Subobject-level Image Tokenization
  Paper • 2402.14327 • Published • 19
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Paper • 2402.14905 • Published • 134
- GPTVQ: The Blessing of Dimensionality for LLM Quantization
  Paper • 2402.15319 • Published • 22

- BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation
  Paper • 2401.17053 • Published • 33
- Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
  Paper • 2402.04248 • Published • 32
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
  Paper • 2402.03300 • Published • 131
- WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
  Paper • 2402.05930 • Published • 39