BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution • Paper • 2510.08697 • Published about 1 month ago
Training Language Model Agents to Find Vulnerabilities with CTF-Dojo • Paper • 2508.18370 • Published Aug 25
Cyber-Zero: Training Cybersecurity Agents without Runtime • Paper • 2508.00910 • Published Jul 29