Ningyu Zhang PRO

Ningyu

https://person.zju.edu.cn/en/ningyu

zxlzr

AI & ML interests

NLP, Knowledge Editing

Recent Activity

upvoted a collection 5 days ago

Chat2Workflow

commented on a paper 8 days ago

From Data to Behavior: Predicting Unintended Model Behaviors Before Training

authored a paper 9 days ago

From Data to Behavior: Predicting Unintended Model Behaviors Before Training

Organizations

upvoted a collection 5 days ago

Chat2Workflow

Collection

1 item • Updated 6 days ago • 1

upvoted a paper 11 days ago

From Data to Behavior: Predicting Unintended Model Behaviors Before Training

Paper • 2602.04735 • Published 11 days ago • 15

upvoted a paper 12 days ago

Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics

Paper • 2602.02343 • Published 13 days ago • 13

upvoted a paper 26 days ago

Aligning Agentic World Models via Knowledgeable Experience Learning

Paper • 2601.13247 • Published 27 days ago • 15

upvoted 2 papers about 1 month ago

Can We Predict Before Executing Machine Learning Agents?

Paper • 2601.05930 • Published Jan 9 • 27

Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency

Paper • 2601.05905 • Published Jan 9 • 20

upvoted a paper 2 months ago

InnoGym: Benchmarking the Innovation Potential of AI Agents

Paper • 2512.01822 • Published Dec 1, 2025 • 36

upvoted a collection 4 months ago

Memory

Collection

Prompt is text-based memory. System II prompting is updating memory. Parametric memory is long-term, while prompt-based memory is short-term. • 23 items • Updated Oct 22, 2025 • 2

upvoted 2 papers 4 months ago

LightMem: Lightweight and Efficient Memory-Augmented Generation

Paper • 2510.18866 • Published Oct 21, 2025 • 114

Executable Knowledge Graphs for Replicating AI Research

Paper • 2510.17795 • Published Oct 20, 2025 • 15

upvoted an article 4 months ago

Article

🛠 ML-Agents Tips & Lessons Learned (AutoMind + MLE-Bench)

Oct 9, 2025

upvoted a paper 4 months ago

When Benchmarks Age: Temporal Misalignment through Large Language Model Factuality Evaluation

Paper • 2510.07238 • Published Oct 8, 2025 • 15

upvoted 4 papers 5 months ago

BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses

Paper • 2510.00232 • Published Sep 30, 2025 • 16

upvoted 2 papers 6 months ago

Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL

Paper • 2508.07976 • Published Aug 11, 2025 • 52

Memp: Exploring Agent Procedural Memory

Paper • 2508.06433 • Published Aug 8, 2025 • 36

upvoted a collection 7 months ago

DataMind

Collection

9 items • Updated Oct 11, 2025 • 3

upvoted a paper 7 months ago

Automating Steering for Safe Multimodal Large Language Models

Paper • 2507.13255 • Published Jul 17, 2025 • 4
