Lan Chen

Orannue

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

Context Forcing: Consistent Autoregressive Video Generation with Long Context

Paper • 2602.06028 • Published 3 days ago • 30

upvoted 2 papers 6 days ago

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

Paper • 2601.22060 • Published 10 days ago • 149

Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

Paper • 2602.02185 • Published 6 days ago • 124

upvoted 3 papers about 1 month ago

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published Jan 6 • 46

NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 130

SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

Paper • 2512.25075 • Published Dec 31, 2025 • 15

upvoted 4 papers about 2 months ago

IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning

Paper • 2512.15635 • Published Dec 17, 2025 • 20

WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling

Paper • 2512.14614 • Published Dec 16, 2025 • 71

KlingAvatar 2.0 Technical Report

Paper • 2512.13313 • Published Dec 15, 2025 • 43

OmniPSD: Layered PSD Generation with Diffusion Transformer

Paper • 2512.09247 • Published Dec 10, 2025 • 47

upvoted 3 papers 2 months ago

Generative Neural Video Compression via Video Diffusion Prior

Paper • 2512.05016 • Published Dec 4, 2025 • 10

DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action

Paper • 2511.22134 • Published Nov 27, 2025 • 22

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 236

upvoted 2 papers 4 months ago

UniMIC: Token-Based Multimodal Interactive Coding for Human-AI Collaboration

Paper • 2509.22570 • Published Sep 26, 2025 • 4

UniVid: Unifying Vision Tasks with Pre-trained Video Generation Models

Paper • 2509.21760 • Published Sep 26, 2025 • 15

upvoted a paper 8 months ago

CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios

Paper • 2506.13977 • Published Jun 11, 2025 • 10

upvoted a paper 9 months ago

VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

Paper • 2505.22019 • Published May 28, 2025 • 11

upvoted 2 papers 10 months ago

DreamO: A Unified Framework for Image Customization

Paper • 2504.16915 • Published Apr 23, 2025 • 24

Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model

Paper • 2504.05594 • Published Apr 8, 2025 • 11

upvoted a paper 11 months ago

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published Mar 25, 2025 • 73