Zuhao Yang

mwxely

https://mwxely.github.io/

AI & ML interests

Large Multimodal Models

Recent Activity

upvoted a paper about 16 hours ago

DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation

Paper • 2601.09688 • Published 1 day ago • 89

liked a dataset 10 days ago

veggiebird/MATPO-data

Viewer • Updated Oct 8, 2025 • 20k • 314 • 2

upvoted a paper 12 days ago

On the Role of Discreteness in Diffusion LLMs

Paper • 2512.22630 • Published 19 days ago • 17

upvoted a paper 13 days ago

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published 15 days ago • 251

upvoted a paper 16 days ago

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Paper • 2512.08765 • Published Dec 9, 2025 • 130

upvoted a paper 17 days ago

EgoX: Egocentric Video Generation from a Single Exocentric Video

Paper • 2512.08269 • Published Dec 9, 2025 • 117

upvoted 2 papers 24 days ago

Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

Paper • 2512.17532 • Published 27 days ago • 65

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Paper • 2512.19693 • Published 24 days ago • 63

upvoted 3 papers about 1 month ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 271

Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

Paper • 2511.14993 • Published Nov 19, 2025 • 229

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 211

New activity in mwxely/TransitBench about 1 month ago

[bot] Conversion to Parquet

#1 opened 6 months ago by parquet-converter

When do you release the code?

#2 opened 5 months ago by zhangzb

authored a paper about 1 month ago

A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models

Paper • 2511.15098 • Published Nov 19, 2025

updated 3 datasets about 1 month ago

liked a dataset about 1 month ago

longvideotool/VideoSIAH-Eval

Viewer • Updated Dec 10, 2025 • 1.28k • 163 • 2

published a dataset about 1 month ago

longvideotool/VideoSIAH-Eval

Viewer • Updated Dec 10, 2025 • 1.28k • 163 • 2

commented on a paper about 1 month ago

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

Paper • 2511.20785 • Published Nov 25, 2025 • 182