ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks Paper • 2502.05352 • Published Feb 7, 2025 • 2
Article OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments 4 days ago • 19
Enterprise Agents and Benchmarks Collection Enterprise agent ecosystem featuring AssetOpsBench (industrial) and ITBench (SRE, FinOps, CISO), CUGA to accelerate AI Automation • 10 items • Updated 1 day ago • 14
Article AssetOpsBench: Bridging the Gap Between AI Agent Benchmarks and Industrial Reality 26 days ago • 31
From Benchmarks to Business Impact: Deploying IBM Generalist Agent in Enterprise Production Paper • 2510.23856 • Published Oct 27, 2025 • 5
Article Granite Embedding R2: Setting New Standards for Enterprise Retrieval Oct 14, 2025 • 16