Anthropic, Google, and Microsoft paid AI agent bug bounties, then kept quiet about the flaws

Anthropic, Google, and Microsoft Paid Bug Bounties for AI Agent Flaws

In short: Security researcher Aonan Guan successfully hijacked AI agents from Anthropic, Google, and Microsoft via prompt injection attacks on their GitHub Actions integrations, stealing API keys and tokens in each case.

All three companies quietly paid bug bounties, with Anthropic offering $100, GitHub $500 (an undisclosed amount was offered by Google), but none issued public advisories or assigned CVEs, leaving users on older versions unaware of the risks.

Security researchers have demonstrated that AI agents from Anthropic, Google, and Microsoft can be hijacked through prompt injection attacks to steal API keys, GitHub tokens, and other secrets. The vulnerabilities, disclosed by researcher Aonan Guan over several months, affect AI tools integrating with GitHub Actions: Anthropic’s Claude Code Security Review, Google’s Gemini CLI Action, and GitHub’s Copilot Agent.

How the Attacks Work

The core technique involves indirect prompt injection. Instead of targeting the AI models directly, Guan embedded malicious instructions in areas where these agents were designed to trust: PR titles, issue descriptions, and comments. When ingested by the agent as part of its workflow, these injected commands were executed as legitimate instructions.

  • Anthropic’s Claude Code Security Review: Guan crafted a PR title containing a prompt injection payload. Claude executed the embedded commands and exposed leaked credentials in its JSON response, which was then posted as a PR comment for anyone to view. The attack allowed exfiltration of Anthropic's API key, GitHub access tokens, and other secrets from the GitHub Actions runner environment.
  • Google’s Gemini: A fake “trusted content section” was injected after legitimate content in a GitHub issue, overriding Gemini’s safety instructions and tricking it into publishing its own API key as an issue comment. Google's Gemini CLI Action treated the injected text as authoritative.
  • GitHub’s Copilot: Guan hid malicious instructions inside an HTML comment in a GitHub issue, making them invisible to humans but fully visible to the AI agent parsing raw content. When a developer assigned the issue to Copilot Agent, it followed the hidden instructions without question.

The Quiet Fix

Following these disclosures, Anthropic acknowledged the bug bounty submissions on its HackerOne platform. However, as of writing, neither Google nor Microsoft has publicly acknowledged or addressed these vulnerabilities.