

Open Agent Specification: Enabling Cross-Framework Comparison of AI Agents

Soufiane Amini (Oracle), Yassine Benajiba (Oracle), Cesare Bernardis (Oracle), Paul Cayet (Oracle), Hassan Chafi (Oracle), Abderrahim Fathan (Oracle), Louis Faucon (Oracle), Damien Hilloulin (Oracle), Sungpack Hong (Oracle), Ingo Kossyk (Oracle), Tirthankar Lahiri (Oracle), Tran Minh Son Le (Oracle), Rhicheek Patra (Oracle), Sujith Ravi (Oracle), Jonas Schweizer (Oracle), Jyotika Singh (Oracle), Shailender Singh (Oracle), Weiyi Sun (Oracle), Kartik Talamadupula (Oracle), Jerry Xu (Oracle)

Architectural Patterns & Composition

Open Agent Specification is a framework-agnostic declarative language for defining AI agents and multi-agent workflows, enabling portability and interoperability across agent frameworks. It provides common abstractions for control flow, data semantics, and tool integration so that workflows built in one framework can run in another.
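To illustrate what a framework-agnostic declarative definition can look like, here is a minimal sketch in plain Python. It does not use PyAgentSpec or the actual Agent Spec schema: the `AgentSpec` and `ToolSpec` classes, their fields, and the JSON layout are all hypothetical stand-ins chosen for illustration. The point is only that an agent described as data, rather than as framework-specific code, can be serialized and reloaded without loss.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToolSpec:
    """Hypothetical declarative description of a tool (not the Agent Spec schema)."""
    name: str
    description: str
    inputs: dict[str, str] = field(default_factory=dict)  # parameter name -> type name


@dataclass
class AgentSpec:
    """Hypothetical declarative description of an agent (not the Agent Spec schema)."""
    name: str
    system_prompt: str
    tools: list[ToolSpec] = field(default_factory=list)


def to_json(spec: AgentSpec) -> str:
    # asdict() recursively converts nested dataclasses into plain dicts.
    return json.dumps(asdict(spec), indent=2, sort_keys=True)


def from_json(text: str) -> AgentSpec:
    raw = json.loads(text)
    # Rebuild nested tool objects first, then pass the remaining fields through.
    tools = [ToolSpec(**tool) for tool in raw.pop("tools", [])]
    return AgentSpec(tools=tools, **raw)


spec = AgentSpec(
    name="qa_agent",
    system_prompt="Answer concisely.",
    tools=[
        ToolSpec(
            name="search",
            description="Look up a fact.",
            inputs={"query": "string"},
        )
    ],
)

# Round trip: the serialized form carries everything needed to rebuild the spec.
restored = from_json(to_json(spec))
```

In this sketch, a framework adapter would consume the serialized document and build its own native objects from it, which is what would let one definition be loaded by more than one framework.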

Presentation

Talk

Paper Session 4: Agent Memory & Planning

Thursday, May 28 · 10:20 AM – 10:30 AM

Bayshore Ballroom

Poster

Thursday, May 28 · 4:30 PM – 6:00 PM

Carmel

Abstract

The proliferation of agent frameworks has made it difficult to evaluate AI agents fairly across runtimes, because frameworks differ in their abstractions, execution semantics, prompt handling, and tool integration mechanisms. We present a cross-framework evaluation methodology for agentic systems based on a shared declarative representation and a standardized evaluation harness. To enable this methodology, we introduce Open Agent Specification (Agent Spec), a framework-agnostic declarative language for representing AI agents and workflows with common components, control and data flow semantics, and validation support. Using Agent Spec as the common representation layer, we study four runtimes (LangGraph, CrewAI, AutoGen, and WayFlow) across three benchmarks (SimpleQA Verified, τ²-Bench, and BIRD-SQL). Our results show that, even when starting from the same agent specification, runtimes differ meaningfully in accuracy, latency, and execution behavior due to framework-specific differences. We also release supporting tooling, including a Python SDK (PyAgentSpec), a reference runtime (WayFlow), and adapters for popular frameworks (e.g., LangGraph, AutoGen, CrewAI), to make such cross-framework comparisons easier to reproduce and extend.
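The general shape of such a harness can be sketched as follows. This is not the authors' harness and it calls no real framework: the `Runner` signature, the two stand-in runners, the toy dataset, and the exact-match scoring are all assumptions made for illustration. Each runner plays the role of an adapter that executes one shared specification on one runtime, and the harness records accuracy and mean latency per runtime.

```python
import time
from dataclasses import dataclass
from typing import Callable

# Hypothetical adapter signature: (shared spec, question) -> answer text.
Runner = Callable[[dict, str], str]


@dataclass
class RunResult:
    runtime: str
    accuracy: float
    mean_latency_s: float


def evaluate(
    spec: dict,
    runners: dict[str, Runner],
    dataset: list[tuple[str, str]],
) -> list[RunResult]:
    """Run the same spec through every runner and score it on the same dataset."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    results = []
    for runtime, run in runners.items():
        correct = 0
        elapsed = 0.0
        for question, expected in dataset:
            start = time.perf_counter()
            answer = run(spec, question)
            elapsed += time.perf_counter() - start
            # Assumption: case-insensitive exact match stands in for a real scorer.
            correct += int(answer.strip().lower() == expected.strip().lower())
        n = len(dataset)
        results.append(RunResult(runtime, correct / n, elapsed / n))
    return results


# Stand-in runners with canned behavior; real adapters would invoke a framework.
def runner_a(spec: dict, question: str) -> str:
    return "4" if "2 + 2" in question else "unknown"


def runner_b(spec: dict, question: str) -> str:
    canned = {"2 + 2?": "4", "capital of France?": "paris"}
    return canned.get(question, "unknown")


shared_spec = {"name": "qa_agent", "system_prompt": "Answer concisely."}
toy_dataset = [("2 + 2?", "4"), ("capital of France?", "Paris")]

results = evaluate(
    shared_spec,
    {"runtime_a": runner_a, "runtime_b": runner_b},
    toy_dataset,
)
for result in results:
    print(f"{result.runtime}: accuracy={result.accuracy:.2f}")
```

Holding the specification, dataset, and scorer fixed while varying only the runner is what allows differences in the output to be attributed to the runtime rather than to the agent definition.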

ACM CAIS 2026 Sponsors