Osilly's Collections

Vision-DeepResearch

updated 2 days ago

  • Osilly/Vision-DeepResearch-Toy-SFT-Data

    Viewer • Updated 4 days ago • 1k • 35

  • Osilly/Vision-DeepResearch-Toy-RL-Data

    Viewer • Updated 4 days ago • 1k • 23

  • Osilly/VDR-Bench

    Viewer • Updated 4 days ago • 2k • 41

  • Osilly/VDR-Bench-testmini

    Viewer • Updated 4 days ago • 500 • 19

  • Osilly/Vision-DeepResearch-8B

    9B • Updated 4 days ago • 23 • 3

  • Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

    Paper • 2601.22060 • Published 7 days ago • 143

  • Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

    Paper • 2602.02185 • Published 3 days ago • 122