
All Accepted Papers

Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain

Léo Boisvert (ServiceNow Research, Mila - Quebec AI Institute, Polytechnique Montréal), Abhay Puri (ServiceNow Research), Chandra Kiran Reddy Evuru (ServiceNow), Nazanin Mohammadi Sepahvand (ServiceNow Research), Nicolas Chapados (Mila - Quebec AI Institute, Polytechnique Montréal), Quentin Cappart (Polytechnique Montréal), Alexandre Lacoste (ServiceNow Research), Krishnamurthy Dvijotham (ServiceNow Research), Alexandre Drouin (ServiceNow Research)

Security & Privacy

Abstract

While finetuning AI agents on interaction data—such as web browsing or tool use—improves their capabilities, it also introduces critical security vulnerabilities within the agentic AI supply chain. We show that adversaries can effectively poison the data collection pipeline at multiple stages to embed hard-to-detect backdoors that, when triggered, cause unsafe or malicious behavior. We formalize three realistic threat models across distinct layers of the supply chain: direct poisoning of finetuning data, pre-backdoored base models, and environment poisoning, a novel attack vector that exploits vulnerabilities specific to agentic training pipelines. Evaluated on two widely adopted agentic benchmarks, all three threat models prove effective: poisoning only a small number of demonstrations is sufficient to embed a backdoor that causes an agent to leak confidential user information with over 80% success. Furthermore, we demonstrate that prominent safeguards, including four guardrail models and one weight-based defense, fail to detect or prevent the malicious behavior. These findings expose an urgent and underexplored threat to agentic AI development, underscoring the need for rigorous security vetting of data collection pipelines and model supply chains.

ACM CAIS 2026 Sponsors