Agent Evaluation & Benchmarking Summit

Scale AI
Wednesday, July 22, 2026 | San Francisco, CA | $199 | 250 attendees

About this event

Evaluating agents is one of the hardest problems in agent development. This one-day summit brings together the teams building evaluation frameworks, from task-specific benchmarks to end-to-end reliability metrics.

Hear from the creators of SWE-bench, GAIA, and AgentBench on what they've learned. Workshop sessions cover building custom evals, A/B testing agent behaviors, and measuring safety properties. If you're shipping agents to production, this is how you know they work.

Don't miss this event

Register on the organizer's website.

Copyright © 2026 Agent Mag. All rights reserved.