PeterJinGo's Collections

Search-R1

updated Aug 12

Preliminary checkpoints with outcome-only RL.

  • Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

    Paper • 2503.09516 • Published Mar 12 • 36

  • PeterJinGo/SearchR1-nq_hotpotqa_train-llama3.2-3b-em-ppo

    4B • Updated Mar 12 • 7

  • PeterJinGo/SearchR1-nq_hotpotqa_train-llama3.2-3b-em-grpo

    4B • Updated Mar 12 • 4

  • PeterJinGo/SearchR1-nq_hotpotqa_train-llama3.2-3b-it-em-ppo

    4B • Updated Mar 12 • 2

  • PeterJinGo/SearchR1-nq_hotpotqa_train-llama3.2-3b-it-em-grpo

    4B • Updated Mar 12 • 8

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-ppo

    3B • Updated Mar 12 • 126

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-grpo

    3B • Updated Mar 12 • 91

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-it-em-ppo

    3B • Updated Mar 12 • 22 • 1

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-it-em-grpo

    3B • Updated Mar 12 • 36

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-ppo

    8B • Updated Mar 21 • 2.58k • 3

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-it-em-ppo

    8B • Updated Mar 12 • 60

  • PeterJinGo/wiki-18-corpus

    Updated Feb 26 • 3.51k

  • PeterJinGo/wiki-18-e5-index

    Updated Feb 26 • 1.69k

  • PeterJinGo/nq_hotpotqa_train

    Viewer • Updated Mar 13 • 221k • 1.24k • 8

  • PeterJinGo/LICENCE

    Viewer • Updated Aug 12 • 202 • 23