sunblaze-ucb's Collections
Intuitor
Updated Jun 25
Models in the paper "Learning to Reason without External Rewards"
sunblaze-ucb/Qwen2.5-3B-Intuitor-MATH-1EPOCH • Text Generation • 3B • Updated Aug 13 • 37 • 1
sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH • Text Generation • 2B • Updated Aug 13 • 7
sunblaze-ucb/Qwen3-14B-Intuitor-MATH-1EPOCH • Text Generation • 15B • Updated Aug 13 • 27
sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH • Text Generation • 7B • Updated Aug 13 • 11
sunblaze-ucb/Qwen3-14B-GRPO-MATH-1EPOCH • Text Generation • 15B • Updated Aug 13 • 22
sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH • Text Generation • 7B • Updated Aug 13 • 1
sunblaze-ucb/Qwen2.5-3B-GRPO-MATH-1EPOCH • Text Generation • 3B • Updated Aug 13 • 5
sunblaze-ucb/Qwen2.5-1.5B-GRPO-MATH-1EPOCH • Text Generation • 2B • Updated Aug 13 • 3
sunblaze-ucb/OLMo-2-7B-SFT-GRPO-MATH-1EPOCH-SYSP • Text Generation • 7B • Updated Jun 24 • 2
sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH-SYSP • Text Generation • 7B • Updated Jun 24 • 4
sunblaze-ucb/Llama-3.2-3B-Instruct-Intuitor-MATH-1EPOCH • Text Generation • 4B • Updated Jun 25 • 47
sunblaze-ucb/Llama-3.2-3B-Instruct-GRPO-MATH-1EPOCH • Text Generation • 4B • Updated Jun 25 • 38