
CAMI: Practical Cost-Aware Agent-Guided Multi-Indexing for Semantic Retrieval

Adnan Qidwai (IBM Software Innovation Lab), Anand Eswaran (IBM Software Innovation Lab), Sonam Mishra (IBM Software Innovation Lab), Jaydeep Sen (IBM Software Innovation Lab), Sachindra Joshi (IBM Software Innovation Lab)

System Optimization & Efficiency · Architectural Patterns & Composition

CAMI is a cost-aware retrieval system that uses an agent to select which semantic enrichments (hypothetical queries, summaries, paraphrases) to generate per document chunk at index time, optimizing retrieval quality within a practical cost budget. It avoids the combinatorial explosion of exhaustively generating all enrichment types across large corpora.

Presentation

Talk

Paper Session 3: Systems Efficiency

Wednesday, May 27 · 4:10 PM – 4:20 PM

Bayshore Ballroom

Poster

Wednesday, May 27 · 5:15 PM – 6:45 PM

Carmel / Monterey


Abstract

RAG ingestion pipelines frequently augment the search corpus index with semantic enrichment indices (e.g., synthetic queries or summaries generated from corpus chunks), which are queried alongside the base index to improve retrieval by better aligning document representations with user intent. While these supplementary representations substantially improve retrieval quality, they introduce a computational bottleneck: the configuration space of enrichment types and generator models is combinatorial, and the cost of exhaustive index-time evaluation scales linearly with corpus size. We introduce CAMI (Cost-Aware Multi-Indexing), a framework that formalizes multi-index construction as a budgeted, multi-objective portfolio selection problem. CAMI targets the upstream decision of which enrichment views to generate and materialize before the retrieval backend is applied. It incorporates three primary mechanisms: (i) an agentic discovery phase that proposes corpus-specific representation templates; (ii) an atomic-unit search procedure that evaluates individual enrichment-model pairs and recombines them via fidelity-local closure to identify synergistic portfolios; and (iii) a confidence-aware promotion schedule that prunes unpromising configurations early, decoupling optimization spend from total corpus size. We evaluate CAMI across diverse retrieval corpora. Our findings show that the framework systematically isolates high-recall portfolios under strict budget constraints, outperforming standard content-only baselines in challenging settings by up to 9.4% recall@10. CAMI also identifies these portfolios with up to 5x less budget than random search baselines, making the approach practical in real production scenarios.
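The abstract does not specify how the confidence-aware promotion schedule works, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' procedure. It assumes a successive-halving style loop in which each surviving configuration is scored on progressively larger query samples, and a configuration is pruned once its upper confidence bound (a Hoeffding bound, chosen here purely for illustration) falls below the lower bound of the current best. The names `recall_at_k`, `hoeffding_radius`, `promote`, the `evaluate` callback, and the configuration labels are all hypothetical.

```python
import math
from typing import Callable, Dict, Hashable, List, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


def hoeffding_radius(n: int, delta: float = 0.05) -> float:
    """Half-width of a Hoeffding interval for a mean of n scores in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def promote(
    configs: Sequence[Hashable],
    evaluate: Callable[[Hashable, int], float],
    sample_sizes: Sequence[int],
    budget: float,
    cost_per_eval: float = 1.0,
    delta: float = 0.05,
) -> Tuple[List[Hashable], float]:
    """Score survivors on growing samples and prune clearly dominated ones.

    evaluate(config, n) is assumed to return the mean recall of `config`
    measured on n sampled queries. Returns (ranked survivors, budget spent).
    """
    survivors = list(configs)
    scores: Dict[Hashable, float] = {}
    spent = 0.0
    for n in sample_sizes:
        round_cost = cost_per_eval * n * len(survivors)
        if spent + round_cost > budget:
            break  # stop before exceeding the budget
        spent += round_cost
        scores = {c: evaluate(c, n) for c in survivors}
        radius = hoeffding_radius(n, delta)
        best_lower = max(scores.values()) - radius
        # Keep a configuration only if it could still match the best one.
        survivors = [c for c in survivors if scores[c] + radius >= best_lower]
    ranked = sorted(survivors, key=lambda c: scores.get(c, 0.0), reverse=True)
    return ranked, spent


# Toy usage with made-up, noise-free recall values (illustrative only).
true_recall = {"content_only": 0.50, "content+queries": 0.80, "content+summaries": 0.52}
ranked, spent = promote(
    configs=list(true_recall),
    evaluate=lambda config, n: true_recall[config],
    sample_sizes=[50, 200, 800],
    budget=2000,
)
```

In this toy run the two weaker configurations are dropped after the second round, so the largest sample is spent on a single survivor. That is the sense in which an early-pruning schedule can keep optimization spend from growing with the number of candidate configurations.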

