Collections

Collections including paper arXiv:2501.08313

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 28
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models

Paper • 2509.19371 • Published Sep 19
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Paper • 2505.06708 • Published May 10 • 4
Selective Attention: Enhancing Transformer through Principled Context Control

Paper • 2411.12892 • Published Nov 19, 2024
A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10 • 187

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 298
Agent-Ark/Toucan-1.5M

Viewer • Updated Oct 4 • 1.65M • 15.9k • 171
facebook/natural_reasoning

Viewer • Updated Feb 21 • 1.15M • 2.94k • 540
Salesforce/Webscale-RL

Viewer • Updated 26 days ago • 1.11M • 9.86k • 79

Rewnozom/agent-zero-v1-a-01

Text Generation • 4B • Updated Jan 18 • 1
TheBloke/MythoMax-L2-13B-GGUF

13B • Updated Sep 27, 2023 • 58.6k • 198
DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF

Text Generation • 18B • Updated Jul 28 • 46.5k • 396
QuantFactory/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF

Text Generation • 8B • Updated Jul 29, 2024 • 11.9k • 118

To Read collection

interesting papers to read

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 62
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24 • 119
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 113
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 141

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 7 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 23
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 84
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

This collection is a list of papers I find to be very interesting.

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 625
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 298
Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 307
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4 • 209

MiniMax (Large Language Model) - Original and Transformers Compatible Weights

MiniMaxAI/MiniMax-Text-01-hf

Text Generation • 456B • Updated Jul 9 • 10.5k • 8
MiniMaxAI/MiniMax-M1-80k-hf

Text Generation • 456B • Updated Jul 9 • 101 • 6
MiniMaxAI/MiniMax-M1-40k-hf

Text Generation • Updated Jul 11 • 113 • 10
MiniMaxAI/MiniMax-Text-01

Text Generation • 456B • Updated Jul 3 • 3k • 649

MiniMaxAI/MiniMax-Text-01

Text Generation • 456B • Updated Jul 3 • 3k • 649
MiniMaxAI/MiniMax-VL-01

Image-Text-to-Text • 456B • Updated Jul 3 • 83k • 280
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 298
Running

117

MiniMaxText01

💬

Generate responses to text and images in a chat interface

Running

11

Inpaint mask maker

👺

Swap faces in images with adjustments
deepseek-ai/DeepSeek-V3-0324

Text Generation • 685B • Updated Mar 27 • 245k • 3.08k
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 298
