Events
Summit
Wednesday, July 22, 2026 | San Francisco, CA | $199 | 250 attendees
About this event
Evaluating AI agents is one of the hardest problems in the field. This one-day summit brings together the teams building evaluation frameworks, from task-specific benchmarks to end-to-end reliability metrics.
Hear from the creators of SWE-bench, GAIA, and AgentBench on what they've learned. Workshop sessions cover building custom evals, A/B testing agent behaviors, and measuring safety properties. If you're shipping agents to production, this is how you know they work.
Don't miss this event
Register on the organizer's website.