Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30 • 101
Glyph: Scaling Context Windows via Visual-Text Compression Paper • 2510.17800 • Published Oct 20 • 66
Language Models Can Learn from Verbal Feedback Without Scalar Rewards Paper • 2509.22638 • Published Sep 26 • 67
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 189
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 237
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning Paper • 2506.18841 • Published Jun 23 • 56
SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models Paper • 2506.04180 • Published Jun 4 • 33
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18 • 141
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published Mar 13 • 29
RFLAV: Rolling Flow matching for infinite Audio Video generation Paper • 2503.08307 • Published Mar 11 • 9
REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding Paper • 2503.07413 • Published Mar 10 • 2
What's in a Latent? Leveraging Diffusion Latent Space for Domain Generalization Paper • 2503.06698 • Published Mar 9 • 4
NeuGrasp: Generalizable Neural Surface Reconstruction with Background Priors for Material-Agnostic Object Grasp Detection Paper • 2503.03511 • Published Mar 5 • 2
Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation Paper • 2503.06594 • Published Mar 9 • 6
LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models Paper • 2502.14834 • Published Feb 20 • 24
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 165
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 298