-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 28 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
Collections
Collections including paper arXiv:2501.08313
-
How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models
Paper • 2509.19371 • Published -
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
Paper • 2505.06708 • Published • 4 -
Selective Attention: Enhancing Transformer through Principled Context Control
Paper • 2411.12892 • Published -
A Survey of Reinforcement Learning for Large Reasoning Models
Paper • 2509.08827 • Published • 187
-
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 298 -
Agent-Ark/Toucan-1.5M
Viewer • Updated • 1.65M • 15.9k • 171 -
facebook/natural_reasoning
Viewer • Updated • 1.15M • 2.94k • 540 -
Salesforce/Webscale-RL
Viewer • Updated • 1.11M • 9.86k • 79
-
Rewnozom/agent-zero-v1-a-01
Text Generation • 4B • Updated • 1 -
TheBloke/MythoMax-L2-13B-GGUF
13B • Updated • 58.6k • 198 -
DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF
Text Generation • 18B • Updated • 46.5k • 396 -
QuantFactory/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF
Text Generation • 8B • Updated • 11.9k • 118
-
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
Paper • 2503.24290 • Published • 62 -
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
Paper • 2503.18878 • Published • 119 -
START: Self-taught Reasoner with Tools
Paper • 2503.04625 • Published • 113 -
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Paper • 2503.14476 • Published • 141
-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 23 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 84 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 625 -
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 298 -
Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 307 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 209
-
MiniMaxAI/MiniMax-Text-01-hf
Text Generation • 456B • Updated • 10.5k • 8 -
MiniMaxAI/MiniMax-M1-80k-hf
Text Generation • 456B • Updated • 101 • 6 -
MiniMaxAI/MiniMax-M1-40k-hf
Text Generation • Updated • 113 • 10 -
MiniMaxAI/MiniMax-Text-01
Text Generation • 456B • Updated • 3k • 649
-
MiniMaxAI/MiniMax-Text-01
Text Generation • 456B • Updated • 3k • 649 -
MiniMaxAI/MiniMax-VL-01
Image-Text-to-Text • 456B • Updated • 83k • 280 -
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 298 -
117
MiniMaxText01
💬Generate responses to text and images in a chat interface