The (In)security Landscape of AI-Powered GitHub Actions (Part 2/2)

When AI meets CI/CD: permission bypasses, prompt injection, and what to do about it.

Overview

AI-powered GitHub Actions from vendors like OpenAI, Anthropic, and Google are now running in thousands of public workflows. We set out to map the shared responsibility model for these actions, and we found:

  • Bypasses of non-default permission configurations that let any external attacker trigger AI execution

  • A novel secret exfiltration vector targeting dynamically created credential files that models don't recognize as sensitive

  • Widespread misconfigurations in production workflows, affecting repositories with 200,000+ combined stars

In Part 1, we explored the traditional threat model of GitHub Actions and the past couple of years of exploitation activity. This post extends that attack surface to account for prompt injection and AI tool capabilities - novel risks that traditional GitHub Actions security guidance doesn't address.

AI Actions - A New Attack Surface

What distinguishes AI actions? They differ from traditional GitHub Actions in two primary ways: (1) prompt injection is a fundamental risk of AI usage, and (2) they have diverse capabilities and nondeterministic output. Because these actions process natural language and ambiguous input, user input (such as a pull request body or an issue description) can be highly rich and original. At the same time, the probabilistic and manipulable nature of AI decision making makes the final output both unpredictable and creative. Yet despite this expanded surface, the core concepts of the original threat model remain relevant, because AI actions must still operate within the constraints of the existing GitHub Actions architecture. Therefore, the original threat model from Part 1 still applies:

AI actions within GitHub Actions threat model: from initial access to impact

Prompt injection is a recognized security concern for AI-powered GitHub Actions, as evidenced by multiple incidents ([1],[2],[3]). The fundamental assumption must be that if an attacker controls the input, a successful injection will occur. This leads to two primary security imperatives, which were the focus of our research:

  1. Input Control: Ensuring that only actors from a trusted zone can control the input provided to the action.

  2. Blast Radius Limitation: Minimizing the potential impact, or "blast radius", should a prompt injection take place.

What we found

We used the above model as guidance for our security analysis of the most popular AI GitHub Actions, including the following:

  • anthropics/claude-code-action (used in over 12K public workflows)

  • google-github-actions/run-gemini-cli (used in over 1K public workflows)

  • openai/codex-action

  • actions/ai-inference

External App security gate bypass 

External bot bypass (openai/codex-action, anthropics/claude-code-action)

There is a built-in permission check in the above actions. By default, only principals (users or Apps/bots) with Write access to the repository can trigger them. At the same time, both actions offer settings that relax this restriction to allow all users, all bots (Apps), or selected principals. For example, take the codex-action documentation before the fix:

Old codex-action documentation

However, a review of the implementation revealed that there is no distinction between an internally installed App and an external App. Here's why this matters: when you install a GitHub App, you grant it a set of required permissions on your repo or org; it becomes part of a Trusted Zone. External GitHub Apps, however, should not enjoy these permissions and must remain external actors. openai/codex-action and anthropics/claude-code-action expanded the trust boundary to ALL Apps and bots. The mistake is evident in these code snippets - there is no check of whether the App is allowed to act on the repository; the checks are purely syntactic:

// Vulnerable code in the write-check portion of the two-stage gate in claude-code-action/src/github/validation/permissions.ts: checkWritePermissions()
if (actor.endsWith("[bot]")) {
  core.info(`Actor is a GitHub App: ${actor}`);
  return true;
}

// Vulnerable code in codex-action/src/checkActorPermissions.ts:69-74
if (allowBotActors && actor.endsWith("[bot]")) {
  core.info(`Actor '${actor}' is a bot account; skipping explicit permission check.`);
  return { status: "approved", actor };
}

What most people don't realize is that App tokens can create issues in other repositories. This enables two bypass scenarios:

  • In case of allowed_bots: * (in claude-code-action) or allow-bots: true (in codex-action) - an attacker bypasses the workflow gate simply by creating an arbitrary App, generating a token, and acting on behalf of that App (sketched after this section).

  • In case a specific App is permitted via allowed_bots: [<appname>] - an attacker can create an App with the exact name of the allowed bot, as long as that name is not reserved or already taken:

App naming: App name reserved vs available

Syntactic name comparison makes the App name the single source of trust. If a workflow trusts an App named anthropics-internal-bot and that name happens to be available, an attacker can claim the name and act on behalf of "this App". This exploitation of name-based trust creates a new category of security flaws, which we term "Dangling GitHub Apps," akin to "Dangling Repos" or "Dangling Domains."
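To make the first bypass scenario concrete, here is a minimal attacker-side sketch in TypeScript. It assumes the attacker has registered their own GitHub App and installed it on an account they control; the App ID, installation ID, repository names, and payload are all hypothetical:

// Hypothetical sketch: an attacker-controlled App files an issue on the
// victim repo. The activity arrives attributed to "<app-name>[bot]",
// which the endsWith("[bot]") checks above approve unconditionally.
import { Octokit } from "@octokit/rest";
import { createAppAuth } from "@octokit/auth-app";

const octokit = new Octokit({
  authStrategy: createAppAuth,
  auth: {
    appId: 12345,                             // attacker's own App (hypothetical ID)
    privateKey: process.env.APP_PRIVATE_KEY!, // key generated by the attacker
    installationId: 67890,                    // installation on the attacker's account
  },
});

// The issue body becomes the AI action's input - i.e., the injection payload.
await octokit.rest.issues.create({
  owner: "victim-org",                        // hypothetical target repository
  repo: "victim-repo",
  title: "Routine housekeeping request",
  body: "<prompt-injection payload>",
});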

Confused Deputy Bypass

Admittedly, for the previous bypass to work, the workflow author needs to set a relaxed policy that explicitly allows all or some bots:

allow-bots: true                               # codex-action
allowed_bots: "*"                              # claude-code-action
allowed_bots: [<bot-name-that-is-available>]   # claude-code-action

However, a more prevalent setting is granting dependabot access to claude-code-action (a quick search shows multiple such workflows across popular projects with over 1K stars).

The danger here lies in a lesser-known technique called Dependabot Deputy Confusion Injection (well outlined in this article from Boost Security), in which an attacker triggers dependabot from within a GitHub Actions invocation, causing it to appear as the github.actor on an arbitrary PR. Instead of directly bypassing a conditional check like if: ${{ github.actor == 'dependabot[bot]' }}, an attacker can leverage this Confused Deputy technique to make dependabot the perceived author of a pull request or issue - by triggering it via @dependabot recreate (for pull_request, pull_request_target, and pull_request_review_comment triggers) or via @dependabot show (for the issue_comment trigger).
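As a minimal sketch of the vulnerable pattern (trigger, job name, and inputs are illustrative, not taken from any specific repository), this is the kind of gate the technique sidesteps - the check passes because dependabot is the apparent actor, even though the content being processed is attacker-controlled:

# Illustrative workflow fragment: the gate trusts the actor's identity,
# not the provenance of the content about to be processed.
on: issue_comment

jobs:
  ai-triage:
    if: ${{ github.actor == 'dependabot[bot]' }}   # satisfied via "@dependabot show"
    runs-on: ubuntu-latest
    steps:
      - uses: anthropics/claude-code-action@v1     # version tag illustrative
        with:
          prompt: ${{ github.event.comment.body }} # attacker-controlled input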

The permission bypass is evident when comparing the workflow runs. Compare the blocked first invocation #12 (user sshayb as the author) with the successful run #13 (dependabot as the author):

Workflow runs - blocked vs bypassed

The same is visible at the workflow log level:

Workflow logs - blocked vs bypassed

This bypasses allowed_bots entirely - another failure of string-comparison permission checks. Here is the full flow of the dependabot bypass:

Dependabot bypass flow

New Secret Exfiltration Vectors

Exfiltrating local credential files

AI models are hard to constrain. They can read files, call APIs, and execute tools - and there's no reliable way to prevent a manipulated model from misusing these capabilities. One prime example is the handling of secrets. While default action models like gemini-2.5-pro, gpt-4o, and claude-sonnet-4-5 are trained to be extremely cautious about exfiltrating workflow-associated CI/CD secrets (referenced as ${{ secrets.* }}), not all secrets are introduced via GitHub Actions' native secrets mechanism. Some are created dynamically as local files.

These dynamically created local secret files, if leaked or stolen, can be highly impactful - often more so than typical API keys, because they frequently grant infrastructure-level access. The following list, though likely incomplete, details common actions that generate such sensitive local secret files:

  • google-github-actions/auth@v2 - Creates ./gha-creds-*.json containing GCP service account keys

  • aws-actions/configure-aws-credentials - Creates ~/.aws/credentials containing AWS access keys

  • azure/login - Creates ~/.azure/ directory containing Azure service principal credentials

  • docker/login-action - Creates ~/.docker/config.json containing registry auth tokens

  • actions/checkout (with deploy key) - Creates ~/.ssh/ files containing SSH private keys

This risk should be considered even in the most basic workflows, such as one that accepts and triages an issue:
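A minimal sketch of that pattern follows (action version tags and run-gemini-cli inputs are approximate, not a verbatim reproduction of any affected workflow):

on:
  issues:
    types: [opened]

jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Writes a local ./gha-creds-<random>.json with service account credentials
      - uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}  # secret name hypothetical
      # The issue body flows straight into the model's context window
      - uses: google-github-actions/run-gemini-cli@v0  # version tag approximate
        with:
          use_vertex_ai: true                          # input name approximate
          prompt: |
            Triage the following issue: ${{ github.event.issue.body }}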

Because the workflow authenticates to GCP Vertex AI rather than using a Gemini API key, it relies on a preceding google-github-actions/auth@v2 step that creates a local JSON file with GCP service account credentials. At this point, it's up to prompt engineering to get Gemini to exfiltrate the file. A simple issue asking for a Changelog inventorization works perfectly.

Gemini refuses the issue-creation step but prints the inventory directly in the workflow log.

Dangers of Verbosity Mode

Security vulnerabilities from leaked secrets in CI/CD output, particularly from debug or diagnostic logs, are a well-documented risk. This danger extends to AI actions with verbosity modes enabled, such as show_full_output: true in claude-code-action or gemini_debug: true in run-gemini-cli action.

For instance, the payload mentioned in the preceding section works against claude-code-action when show_full_output: true is active. While the model under test (claude-sonnet-4-5@20250929) is sophisticated enough to identify sensitive secrets within a file, it must first read the file to do so. Crucially, the outcome of every step, including the file-read operation, becomes visible in the workflow run log, exposing the secrets along the way.
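A sketch of the risky configuration (step fragment only; version tag illustrative):

# Public repo + full output = every tool result, including file reads,
# ends up in a world-readable workflow log.
- uses: anthropics/claude-code-action@v1
  with:
    show_full_output: true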

Inherent Risks

In addition to the above security findings, there are several inherent risks that come with AI usage that every action author must keep in mind.

Absence of Built-in Permission Validation: Most AI actions delegate permission validation entirely to workflow authors. This is a design choice, not a bug - but it shifts responsibility to users who may not understand the security implications. Only two actions - claude-code-action and codex-action - implement their own permission checks (invoking actors must have Write access by default), while the others have no built-in permission validation. This means there is a set of workflow triggers available to any actor from the untrusted zone, which should therefore be considered dangerous:

Actions without access control:

  • actions/ai-inference
  • anthropics/claude-code-security-review
  • google-github-actions/run-gemini-cli
  • qodo-ai/pr-agent
  • coderabbitai/ai-pr-reviewer

Events triggered from the Untrusted Zone:

No access to secrets:

  • pull_request from fork

With access to secrets:

  • pull_request_target
  • pull_request_review_comment
  • issues
  • issue_comment
  • discussion
  • discussion_comment

Partial Sanitization of User-Controlled Content: Prompt injection is an inherent challenge for all LLM-based systems. This risk is not theoretical - multiple Claude Code CVEs explicitly note that "reliably exploiting this requires the ability to add untrusted content into a Claude Code context window" (GHSA-x56v-x2h6-7j34, GHSA-qxfv-fcpc-w36x, GHSA-qgqw-h4xq-7w8w). AI GitHub Actions provide exactly this capability: they interpolate PR titles, issue bodies, and comments directly into prompts. Unfortunately, no complete mitigation exists; defenses are probabilistic, and the severity depends on what tools the AI can access, the permissions the workflow runs with, and its access to secrets. All reviewed AI actions interpolate attacker-controlled content into prompts, although some make a partial effort to sanitize it - for example, anthropics/claude-code-action:

// code at src/github/utils/sanitizer.ts:54-62
export function sanitizeContent(content: string): string {
  content = stripHtmlComments(content);
  content = stripInvisibleCharacters(content);
  content = stripMarkdownImageAltText(content);
  content = stripMarkdownLinkTitles(content);
  content = stripHiddenAttributes(content);
  content = normalizeHtmlEntities(content);
  content = redactGitHubTokens(content);
  return content;
}
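For orientation, a hypothetical call site would apply this before interpolating the untrusted text into the prompt (the field access below is illustrative, not the action's actual code path); as noted, such stripping narrows the injection channel rather than closing it:

// Sanitization runs before the untrusted text reaches the prompt;
// anything the strip functions don't anticipate still gets through.
const issueBody = sanitizeContent(context.payload.issue?.body ?? "");
const prompt = `Triage the following issue:\n${issueBody}`;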

MCP / Tool Access: Actions with MCP (Model Context Protocol) or direct tool access enable the AI to perform operations beyond text generation. The following table summarizes the default tools and capabilities of the top four reviewed actions:

| Action             | File Write | Shell     | Git Push | Create PRs | Fork Repos |
| ------------------ | ---------- | --------- | -------- | ---------- | ---------- |
| claude-code-action | YES        | Git only  | YES      | Via MCP    | Via MCP    |
| codex-action       | Workspace  | Sandboxed | Limited  | No         | No         |
| run-gemini-cli     | YES        | Limited   | YES      | YES        | YES        |
| ai-inference       | No         | No        | No       | Via MCP    | Via MCP    |

Each cell in this table is a capability an injected prompt can reach. Treat the cells as blast-radius indicators, not features.

Disclosures

Following this research, OpenAI made a series of changes ([1],[2]) to introduce a more granular allow-bot-users setting and to remove dependabot from the trusted zone. Anthropic updated their documentation to make clear that allowed bots don't need Write access. In addition, we reported further examples of vulnerable workflows in open-source repositories to the appropriate project maintainers.

Recommendations

Beyond the immediate disclosures, we offer a set of recommendations grouped by the target audience:

Workflow authors:

  • Understand the GitHub Actions threat model and its AI-augmented extension. Do not rely on the underlying models to prevent prompt injection.

  • When using anthropics/claude-code-action and openai/codex-action, avoid insecure allow-bots and allowed_bots settings.

  • When using google-github-actions/run-gemini-cli and other actions without access control, add author-association checks for comment triggers:

jobs:
  ai-review:
    if: |
      contains(fromJSON('["OWNER","MEMBER","COLLABORATOR"]'),
        github.event.comment.author_association)

  • Avoid combining an AI action that has broad access with authentication actions that generate sensitive local files in the same workflow.

  • Avoid using verbosity mode in public repositories.

Action authors:

  • For actions without built-in access-permission checks, implement secure defaults with built-in permission validation similar to claude-code-action and codex-action (see the sketch after this list). By default, users without Write access should not be able to supply input to AI models.

  • Document security requirements explicitly, particularly what actors can trigger the action.

  • Log warnings for dangerous configurations.
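
As a reference point for the first item, here is a minimal sketch of such a deny-by-default check using Octokit (action plumbing omitted, error handling simplified; not the verbatim implementation of any of the actions above):

// Approve only actors whose effective permission on the repository is
// write or admin. A name suffix like "[bot]" proves nothing.
import * as github from "@actions/github";

export async function actorHasWriteAccess(token: string): Promise<boolean> {
  const octokit = github.getOctokit(token);
  const { owner, repo } = github.context.repo;
  const { data } = await octokit.rest.repos.getCollaboratorPermissionLevel({
    owner,
    repo,
    username: github.context.actor,
  });
  return data.permission === "write" || data.permission === "admin";
}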

Wiz customers: use the rich set of CCRs and Controls that form the CI/CD Security category of the "Wiz for Code and Supply Chain Security Framework".

These controls ensure you do not introduce relevant risks into your CI/CD environment.
