
All Accepted Demos

Arena: Benchmarking AI Agent Frameworks Under Fixed-Model Conditions

Roberto Milev (Navan), Uday Kanagala (Navan)

Evaluation & Benchmarking

Summary

An open-source benchmarking tool that evaluates agent frameworks under fixed-model conditions, finding that scenario-specific orchestration adds no measurable benefit over generic agentic loops.

Description

Existing agent benchmarks evaluate models, not the frameworks that orchestrate them, making it impossible to isolate how much performance comes from the model and how much from the framework's orchestration code. We present Arena, an open-source benchmarking tool that evaluates agent frameworks under fixed-model conditions. Arena fixes six frameworks — Claude Agent SDK, LangChain, LangGraph, AWS Strands, CrewAI, and Google ADK — to Claude Sonnet 4.5 on AWS Bedrock, connects them to the same MCP tool server, and scores them with a deterministic evaluator across three scenarios of increasing complexity using six metrics: code complexity, step efficiency, latency, correctness, consistency, and cost. We ask: does explicitly programming agent flows provide measurable benefit over a generic agentic loop driven by prompts? Our evaluation reveals that on simple tasks all frameworks perform comparably, but as complexity grows, traditional frameworks require 2–4× more scenario-specific orchestration code yet gain no correctness advantage. The Claude Agent SDK uses the same generic agentic loop across all scenarios; only the prompt changes. We contribute (1) a fixed-model methodology isolating framework behavior from model capability, (2) an extensible open-source tool for practitioner evaluation, and (3) empirical evidence that scenario-specific orchestration adds no measurable benefit over generic agentic loops driven by prompts.
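The per-framework scoring described above can be sketched as follows. This is a minimal illustration of how a deterministic evaluator might aggregate repeated runs of one framework on one scenario; `RunResult`, `score`, and the consistency definition (fraction of runs agreeing with the majority correctness outcome) are illustrative assumptions, not Arena's actual API.

```python
from dataclasses import dataclass

@dataclass
class RunResult:
    """One run of a framework on a scenario (hypothetical record shape)."""
    steps: int        # tool-call steps taken
    latency_s: float  # wall-clock latency in seconds
    correct: bool     # did the deterministic check pass?
    cost_usd: float   # model API cost for the run

def score(results: list[RunResult]) -> dict:
    """Aggregate deterministic metrics over repeated runs.

    Consistency here is the fraction of runs whose correctness matches
    the majority outcome across all runs -- one plausible definition.
    """
    n = len(results)
    n_correct = sum(r.correct for r in results)
    majority = n_correct * 2 >= n
    consistency = sum(r.correct == majority for r in results) / n
    return {
        "correctness": n_correct / n,
        "consistency": consistency,
        "avg_steps": sum(r.steps for r in results) / n,
        "avg_latency_s": sum(r.latency_s for r in results) / n,
        "total_cost_usd": sum(r.cost_usd for r in results),
    }

# Three hypothetical runs of one framework on one scenario.
runs = [
    RunResult(steps=6, latency_s=4.2, correct=True, cost_usd=0.03),
    RunResult(steps=7, latency_s=4.8, correct=True, cost_usd=0.04),
    RunResult(steps=9, latency_s=6.1, correct=False, cost_usd=0.05),
]
print(score(runs))
```

Because the model, tools, and evaluator are held fixed, differences in these scores between frameworks can be attributed to orchestration rather than model capability.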

ACM CAIS 2026 Sponsors