We introduce AdInject, a novel, real-world black-box attack method that leverages internet advertising delivery to inject malicious content into Web Agents’ environments, misleading them into clicking ads with high success rates (often >60%, sometimes approaching 100%).
Vision-Language Model (VLM) based Web Agents represent a significant step towards automating complex tasks by simulating human-like interaction with websites. However, their deployment in uncontrolled web environments introduces significant security vulnerabilities. Existing research on adversarial environmental injection attacks often relies on unrealistic assumptions, such as direct HTML manipulation, knowledge of user intent, or access to agent model parameters, limiting their practical applicability. In this paper, we propose AdInject, a novel and real-world black-box attack method that leverages the internet advertising delivery to inject malicious content into the Web Agent’s environment. AdInject operates under a significantly more realistic threat model than prior work, assuming a black-box agent, static malicious content constraints, and no specific knowledge of user intent. AdInject includes strategies for designing malicious ad content aimed at misleading agents into clicking, and a VLM-based ad content optimization technique that infers potential user intents from the target website’s context and integrates these intents into the ad content to make it appear more relevant or critical to the agent’s task, thus enhancing attack effectiveness. Experimental evaluations demonstrate the effectiveness of AdInject, attack success rates exceeding 60% in most scenarios and approaching 100% in certain cases. This strongly demonstrates that prevalent advertising delivery constitutes a potent and real-world vector for environment injection attacks against Web Agents. This work highlights a critical vulnerability in Web Agent security arising from real-world environment manipulation channels, underscoring the urgent need for developing robust defense mechanisms against such threats.
Our research on AdInject revealed several critical aspects of Web Agent vulnerabilities:
AdInject’s methodology focuses on misleading a Web Agent into clicking a malicious ad, adhering to a realistic threat model.
The core principle is to make the agent perceive clicking the ad as a necessary step to complete its task.
To enhance effectiveness, AdInject employs a VLM to optimize ad content:
AdInject was evaluated on VisualWebArena and OSWorld benchmarks using various Web Agents and settings.
The primary experiments, using default-sized pop-up style ads without content optimization, demonstrated significant attack success rates.
Table 1: Main Results on VisualWebArena (Partial) (Corresponds to Table 1 in the AdInject paper)
Agent | Model | Setting | ||||
---|---|---|---|---|---|---|
Basic Agent | GPT-4o | A11y Tree | 73.15 | 1.45 | 27.32 | 25.93 |
A11y Tree + Screen | 93.51 | 1.00 | 45.83 | 44.90 | ||
Set-of-Marks | 93.99 | 1.75 | 18.51 | 25.93 | ||
Basic Agent | Claude-3.7 | A11y Tree | 37.92 | 2.74 | 30.56 | 20.38 |
A11y Tree + Screen | 66.67 | 2.42 | 45.38 | 33.33 | ||
Set-of-Marks | 53.24 | 8.50 | 16.67 | 20.83 |
These results show high ASRs, especially for GPT-4o, indicating the base AdInject method is highly effective at inducing unwanted clicks.
The VLM-based ad content optimization further improved attack effectiveness.
Table 2: Results of Ad Content Optimization (Partial) (Corresponds to Table 3 in the AdInject paper)
Model | Setting | |||
---|---|---|---|---|
GPT-4o | A11y Tree | 73.15 | 1.45 | 27.32 |
A11y Tree w/ Optimize | 79.17 | 1.29 | 25.00 | |
A11y Tree + Screen | 93.51 | 1.00 | 45.83 | |
A11y Tree + Screen w/ Optimize | 94.90 | 1.03 | 43.06 | |
Claude-3.7 | A11y Tree | 37.92 | 2.74 | 30.56 |
A11y Tree w/ Optimize | 63.89 | 2.28 | 31.49 | |
A11y Tree + Screen | 66.67 | 2.42 | 45.38 | |
A11y Tree + Screen w/ Optimize | 77.32 | 1.18 | 38.43 |
Optimization consistently increased ASR and often reduced the steps needed for the agent to click the ad, demonstrating the value of tailoring ad content.
AdInject’s core design principle significantly outperformed other ad content strategies.
Table 3: Results of Baseline Comparison (Partial, VisualWebArena, A11y Tree + Screen) (Corresponds to Table 4 in the AdInject paper)
Model | Ad Setting | |||
---|---|---|---|---|
GPT-4o | Vanilla | 0.00 | - | 45.83 |
Injection | 0.00 | - | 41.67 | |
Virus | 20.83 | 3.14 | 42.13 | |
Speculate | 4.17 | 5.33 | 39.82 | |
Ours | 93.51 | 1.00 | 45.83 | |
Claude-3.7 | Vanilla | 0.00 | - | 36.57 |
Injection | 0.00 | - | 44.90 | |
Virus | 1.39 | 13.33 | 43.06 | |
Speculate | 3.24 | 8.14 | 45.83 | |
Ours | 66.67 | 2.42 | 45.38 |
The 0.00% ASR for ‘Vanilla’ ads confirms clicks are attack-induced. AdInject’s strategy of framing the ad click as necessary for task completion is markedly more effective.
Even with defensive prompts, AdInject maintained notable effectiveness.
Table 4: Results of Defense Experiments (Partial, VisualWebArena, Basic Agent GPT-4o, A11y Tree + Screen) (Corresponds to Table 7 in the AdInject paper)
Position | Defense Level | |||
---|---|---|---|---|
- | None | 93.51 | 1.00 | 45.83 |
Goal | 1 (Generic) | 93.51 | 1.01 | 38.89 |
2 (Ads) | 92.60 | 1.03 | 39.82 | |
3 (Specific) | 56.94 | 1.09 | 46.29 | |
System | 1 (Generic) | 93.99 | 1.02 | 47.22 |
2 (Ads) | 92.60 | 1.06 | 50.00 | |
3 (Specific) | 89.35 | 1.22 | 51.85 |
Generic warnings were ineffective. Only specific warnings (Level 3), particularly when placed in the Goal prompt, reduced ASR, but the attack still succeeded in over half the cases.
In this paper, we introduce AdInject, a real-world black-box attack method targeting VLM-based Web Agents. Leveraging the internet advertising delivery, AdInject injects malicious content under a strict threat model, avoiding unrealistic assumptions of prior works. Our experimental results on VisualWebArena and OSWorld demonstrate the significant effectiveness of AdInject, achieving high attack success rates, often exceeding 60% and approaching 100% in certain scenarios. This work reveals a critical security vulnerability in Web Agents stemming from realistic environment manipulation channels, underscoring the urgent need for developing robust defense mechanisms against such practical threats.
@misc{wang2025adinjectrealworldblackboxattacks,
title={AdInject: Real-World Black-Box Attacks on Web Agents via Advertising Delivery},
author={Haowei Wang and Junjie Wang and Xiaojun Jia and Rupeng Zhang and Mingyang Li and Zhe Liu and Yang Liu and Qing Wang},
year={2025},
eprint={2505.21499},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2505.21499},
}