Chevolier's Collections

Multimodal

updated 7 days ago

  • MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

    Paper • 2510.08540 • Published Oct 9 • 108

  • Diffusion Transformers with Representation Autoencoders

    Paper • 2510.11690 • Published 28 days ago • 160

  • Spotlight on Token Perception for Multimodal Reinforcement Learning

    Paper • 2510.09285 • Published Oct 10 • 36

  • Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation

    Paper • 2510.17354 • Published 21 days ago • 33

  • RL makes MLLMs see better than SFT

    Paper • 2510.16333 • Published 24 days ago • 47

  • ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

    Paper • 2510.27492 • Published 11 days ago • 78