Comment and Control: How GitHub Comments Became a New Prompt-Injection Threat

Developer workstation showing GitHub PR with hidden HTML comment exfiltrating API keys

A new class of prompt-injection attacks—dubbed “Comment and Control”—turns GitHub pull requests, issues, and comments into attack surfaces that can hijack AI coding agents and siphon secrets directly from CI/CD environments. Unlike classic prompt injection that waits for a user to feed a document to an agent, this pattern is proactive: opening a PR or posting an issue can automatically trigger an AI agent (via GitHub Actions) and cause it to execute attacker-supplied instructions, then exfiltrate credentials back into the repository or comments. The result is a remarkably stealthy loop that runs entirely inside GitHub, needing no external callback server.

How the attack works

Comment and Control leverages two simple facts about many AI agent integrations: they include untrusted GitHub content in the agent’s prompt context, and they run in a runtime that often has access to environment secrets and tools. An attacker crafts a PR title, issue body, or comment containing instruction-like text that the agent treats as trusted context. When the Action runs automatically on events such as pull_request or issue_comment, the agent ingests that malicious text, follows the attacker’s instructions (for example, to read environment variables or run shell commands), and returns the results via a PR comment, issue comment, commit, or other Git-based channel. Because these workflows are normal parts of the development lifecycle, the exfiltration blends into routine activity and often bypasses monitoring.

Vendor findings: three cross-vendor examples

Claude Code — PR title to remote command execution

Anthropic’s Claude Code Security Review action interpolated the PR title into its prompt without sanitization and ran the Claude CLI with full environment access. Because the agent inherited the Action runtime environment, it could access ANTHROPIC_API_KEY and GITHUB_TOKEN. A malicious PR title that broke out of the prompt context could instruct the agent to run commands such as whoami, ps, or env. The agent then returned a JSON-formatted “security finding” containing environment details to the PR comment. Anthropic rated the issue critical and issued a high CVSS score.

Gemini CLI Action — issue comment to API key leak

Google’s Gemini CLI Action included issue titles, bodies, and comments under an “Additional Content” section in its prompt. An attacker could inject a fake “Trusted Content Section” that overrode Gemini’s safety instructions, causing it to post the GEMINI_API_KEY as a public issue comment. This vulnerability demonstrated how surprisingly small prompt-context manipulations can force an agent to reveal secrets it understands but should not expose.

Copilot Agent — layered defenses bypassed

GitHub’s Copilot Agent (SWE Agent) demonstrated the most sophisticated exploitation path. GitHub had implemented three runtime defenses—environment variable filtering, secret scanning, and a network firewall—but each was bypassed in turn:

Environment filtering was limited to child subprocesses; an attacker could read the unfiltered parent process’s /proc/[pid]/environ to recover tokens.

Secret scanning looked for token prefixes (ghs_, ghu_), but exfiltrated secrets could be base64-encoded to evade pattern matching.

Firewall rules permitted normal git pushes; exfiltrating encoded credentials via a standard git push to a PR looked like normal developer activity and raised no alarms.

The payload was hidden inside an HTML comment in the issue body (invisible in rendered Markdown but parsed by the agent), which allowed the attacker to assign the issue to Copilot and remain stealthy.

Affected components and impact

Component	Injection Surface	Exfiltration Channel	Credentials Leaked	Bounty
Claude Code	PR title	PR comment	ANTHROPIC_API_KEY, GITHUB_TOKEN	$100
Gemini CLI	Issue comments	Issue comment	GEMINI_API_KEY	$1,337
Copilot Agent	Issue body (HTML comment)	Git commit	GITHUB_TOKEN, COPILOT_API_TOKEN, +2 more	$500

Why this is broader than GitHub Actions

Although the research focused on GitHub Actions and pull-request workflows, the architectural flaw is general: any AI agent that processes untrusted input while holding secrets or tool access is at risk. That expands the threat surface to Slack bots, Jira automation, email-processing agents, deployment systems, and other integrations that accept free-text input or render HTML/Markdown into prompt context. The difficulty of reliably distinguishing “trusted” and “untrusted” inputs means many automation setups could be abused in similar ways.

Practical mitigations and defensive hygiene

Allowlist tools, do not blocklist. Configure agents with –allowed-tools (or equivalent) so they only have the minimal capabilities required for the task. Blocklists are brittle and easy to bypass.
Apply least privilege to secrets. Agents that only need to read metadata or triage issues should not run with full GITHUB_TOKEN write scopes or production API keys.
Human approval gates. Require manual review or explicit authorization before agents run actions that can access secrets or perform outbound operations.
Sanitize and canonicalize input. Treat any GitHub-provided text as untrusted and avoid interpolating it directly into prompts without normalization and strong contextual boundaries.
Audit integrations and logs. Inventory all AI agents and CI/CD integrations, and monitor logs for anomalous reads of environment variables, unusual git commits, or unexpected comment posts.
Secret-handling safeguards. Combine runtime controls (e.g., strict subprocess isolation, no environment inheritance) with repository-level protections and secret scanning that includes encoded forms and entropy checks.

Closing thoughts

Comment and Control demonstrates a simple but powerful lesson: automation that bridges untrusted content and privileged runtime capabilities creates an attack surface that traditional defenses don’t always cover. As teams accelerate development with AI agents, developers and security teams must treat these integrations as first-class security risks—applying least privilege, hardened runtime policies, and human-in-the-loop controls to prevent malicious prompts from becoming a vector for credential theft.