
Siwei Wen

lingcco

AI & ML interests

None yet

Recent Activity

upvoted a paper 16 days ago

OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation

updated a model about 1 month ago

lingcco/cooking-qwen2.5-7b_v2

updated a model about 1 month ago

lingcco/trans_harv-qwen2.5-7b_v2


Organizations

None yet

upvoted a paper 16 days ago

OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation

Paper • 2510.26213 • Published 17 days ago • 9

upvoted a paper about 2 months ago

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26 • 131

upvoted a paper 3 months ago

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation

Paper • 2508.09987 • Published Aug 13 • 25

upvoted 2 papers 4 months ago

EDGE-GRPO: Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity

Paper • 2507.21848 • Published Jul 29 • 8

The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs

Paper • 2507.11097 • Published Jul 15 • 64

upvoted a paper 5 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262

upvoted 2 papers 6 months ago

Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Paper • 2505.23747 • Published May 29 • 68

Shifting AI Efficiency From Model-Centric to Data-Centric Compression

Paper • 2505.19147 • Published May 25 • 144

upvoted 2 papers 7 months ago

TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning

Paper • 2504.09641 • Published Apr 13 • 16

FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding

Paper • 2504.09925 • Published Apr 14 • 38

upvoted 3 papers 8 months ago

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation

Paper • 2504.02782 • Published Apr 3 • 57

Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation

Paper • 2503.14905 • Published Mar 19 • 20

LEGION: Learning to Ground and Explain for Synthetic Image Detection

Paper • 2503.15264 • Published Mar 19 • 21

upvoted a paper 9 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 154

upvoted 2 papers 11 months ago

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published Dec 12, 2024 • 98

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3, 2024 • 95
