ShareGPT4Video

Activity Feed

Lin-Chen

authored a paper about 1 month ago

DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action

Paper • 2511.22134 • Published Nov 27, 2025 • 21

LanguageBind

authored 5 papers 3 months ago

Look-Back: Implicit Visual Re-focusing in MLLM Reasoning

Paper • 2507.03019 • Published Jul 2, 2025 • 1

Can Understanding and Generation Truly Benefit Together -- or Just Coexist?

Paper • 2509.09666 • Published Sep 11, 2025 • 34

FlashI2V: Fourier-Guided Latent Shifting Prevents Conditional Image Leakage in Image-to-Video Generation

Paper • 2509.25187 • Published Sep 29, 2025 • 2

GIR-Bench: Versatile Benchmark for Generating Images with Reasoning

Paper • 2510.11026 • Published Oct 13, 2025 • 17

Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback

Paper • 2510.16888 • Published Oct 19, 2025 • 21

Lin-Chen

authored a paper 3 months ago

Agentic Jigsaw Interaction Learning for Enhancing Visual Perception and Reasoning in Vision-Language Models

Paper • 2510.01304 • Published Oct 1, 2025 • 10

Wiselnn

authored a paper 3 months ago

SIM-CoT: Supervised Implicit Chain-of-Thought

Paper • 2509.20317 • Published Sep 24, 2025 • 41

Jinsong-Li

authored 4 papers 5 months ago

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

Paper • 2502.08590 • Published Feb 12, 2025 • 42

Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings

Paper • 2506.04997 • Published Jun 5, 2025

ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing

Paper • 2506.19848 • Published Jun 24, 2025 • 26

Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models

Paper • 2508.00819 • Published Aug 1, 2025 • 62

LanguageBind

authored 7 papers 7 months ago

Next Patch Prediction for Autoregressive Visual Generation

Paper • 2412.15321 • Published Dec 19, 2024 • 1

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses

Paper • 2412.00397 • Published Nov 30, 2024

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation

Paper • 2503.07265 • Published Mar 10, 2025 • 4

SwapAnyone: Consistent and Realistic Video Synthesis for Swapping Any Person into Any Video

Paper • 2503.09154 • Published Mar 12, 2025

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation

Paper • 2505.20292 • Published May 26, 2025 • 52

ImgEdit: A Unified Image Editing Dataset and Benchmark

Paper • 2505.20275 • Published May 26, 2025 • 18

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Paper • 2506.03147 • Published Jun 3, 2025 • 58

Lin-Chen

authored a paper 7 months ago

VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

Paper • 2505.22019 • Published May 28, 2025 • 11

Team members 4
