- Training Compute-Optimal Large Language Models
  Paper • 2203.15556 • Published • 11
- Perspectives on the State and Future of Deep Learning - 2023
  Paper • 2312.09323 • Published • 8
- MobileSAMv2: Faster Segment Anything to Everything
  Paper • 2312.09579 • Published • 24
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 21
Collections including paper arxiv:2312.10035
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 43
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 21
- Extending Context Window of Large Language Models via Semantic Compression
  Paper • 2312.09571 • Published • 16
- PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation
  Paper • 2312.17276 • Published • 16

- aMUSEd: An Open MUSE Reproduction
  Paper • 2401.01808 • Published • 31
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 28
- SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
  Paper • 2401.00604 • Published • 6
- LARP: Language-Agent Role Play for Open-World Games
  Paper • 2312.17653 • Published • 33

- SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
  Paper • 2312.07987 • Published • 41
- Interfacing Foundation Models' Embeddings
  Paper • 2312.07532 • Published • 15
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 21
- TheBloke/quantum-v0.01-GPTQ
  Text Generation • 1B • Updated • 2 • 2