Introducing the Red Agent POV Series

Malicious actors are actively weaponizing AI to scan and exploit public-facing environments at unprecedented speed and scale. To stay ahead, defenders have had to evolve, and AI-powered offensive security has made massive leaps. Current frontier models, paired with our purpose-built agent harness, are already finding real, exploitable, multi-step attack chains on a daily basis. With these models, AI-powered offensive security can now find flaws that would have seemed out of reach for automated testing just a year ago, flaws that are non-trivial even for expert pentesters and bug bounty hunters working manually.

This is why we built the Red Agent, our AI-powered pentester that operates at machine speed to help teams stay ahead in the AI era. By continuously reasoning about application behavior, it uncovers the kind of complex, logic-driven vulnerabilities and multi-step attack chains that take human testers days of manual work to find.

When adversaries are continuously scanning your perimeter with AI, periodic manual testing cannot keep up. For security and offensive teams, turning on the Red Agent is an immediate necessity: it finds your critical exploitable risks so you can close them before adversaries find them.

We are excited to introduce a new blog series from the Wiz Research team, where we pull back the curtain to give you an inside look at how the Red Agent uncovers these complex, exploitable risks in production. Throughout this series, we will focus on specific bug classes discovered by the Red Agent and share real examples of how it reasons through APIs and finds context-driven vulnerabilities. Today, we are launching the first blog in this series, detailing how the Red Agent uncovered a critical SSRF vulnerability in production systems.

What is the Red Agent?

The Red Agent is Wiz's AI-powered pentester, built to continuously discover logic-driven vulnerabilities and misconfigurations across public-facing environments. It condenses what traditionally took human testers hours or days of manual work into an autonomous, continuous process, operating at machine speed without sacrificing depth. It does so by reasoning about the application: it builds hypotheses from failed probes, accumulates constraints from blocked attempts, and synthesizes multi-step attack paths that only emerge from understanding how an application actually behaves. When a request gets blocked, the Red Agent uses that as a data point to narrow the solution space for the next attempt. This allows it to uncover sophisticated attack chains at scale, giving defenders the ability to find and remediate critical risks before adversaries can exploit them.
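The idea of treating a blocked request as a constraint can be illustrated with a toy sketch. This is not the Red Agent's implementation: the `search` function, the `toy_probe` stand-in, and the candidate URLs are all hypothetical, and the "filter" is simulated locally rather than being a real service. The toy filter also conveniently reports which token it rejected; a real target would not, and inferring the rule from behavior is the hard part.

```python
from typing import Callable, Iterable, Optional


def search(
    candidates: Iterable[str],
    probe: Callable[[str], Optional[str]],
) -> tuple[Optional[str], list[str]]:
    """Try candidates in order, skipping any that violate a learned constraint.

    `probe` returns None when the candidate is accepted, or the token that
    caused the block. Returns (accepted candidate or None, candidates sent).
    """
    blocked_tokens: set[str] = set()  # constraints learned from blocked attempts
    sent: list[str] = []
    for candidate in candidates:
        # Prune candidates already ruled out, without spending a request.
        if any(token in candidate for token in blocked_tokens):
            continue
        sent.append(candidate)
        reason = probe(candidate)
        if reason is None:
            return candidate, sent
        blocked_tokens.add(reason)
    return None, sent


def toy_probe(candidate: str) -> Optional[str]:
    """Hypothetical local filter: blocks anything containing a denied token."""
    for token in ("localhost", "127.0.0.1"):
        if token in candidate:
            return token
    return None


CANDIDATES = [
    "http://localhost/a",
    "http://localhost/b",
    "http://127.0.0.1/a",
    "http://127.0.0.1/b",
    "http://example.test/a",
]

accepted, sent = search(CANDIDATES, toy_probe)
print(accepted, len(sent))
```

Each blocked attempt removes a whole family of candidates, so the loop sends three requests here instead of five.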

Red Agent in the wild

To give an idea of what defending at machine speed looks like, we looked at the aggregate data from the Red Agent’s performance over a one-month period.

Operating at a scale that would be impossible for a human alone, the Red Agent completed hundreds of thousands of autonomous scans across ~1,000 environments. In that window, it surfaced over 17,000 unique findings, including over 5,500 High and Critical vulnerabilities that represent validated, multi-step attack chains in production environments.

Here is a high-level look at the key takeaways from those findings:

Access control remains the dominant failure mode
Authorization and access control flaws remain the single biggest gap in modern cloud applications: 54% of all unique findings stemmed from broken access control. This includes authentication bypasses, unrestricted access to components or sensitive information, IDOR/BOLA, BFLA, and default credentials. These findings reflect real production applications that are routinely shipped with entirely unprotected management APIs and exposed internal endpoints.
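To make the IDOR/BOLA class concrete, here is a minimal sketch of the missing check. The `Invoice` record, the in-memory `INVOICES` store, and both handler functions are hypothetical names invented for illustration; they do not come from any finding described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    owner_id: int
    total: float


class Forbidden(Exception):
    """Raised when the requester may not access the object."""


# Hypothetical in-memory store standing in for a database.
INVOICES = {
    1: Invoice(invoice_id=1, owner_id=10, total=99.0),
    2: Invoice(invoice_id=2, owner_id=20, total=15.5),
}


def get_invoice_unsafe(requester_id: int, invoice_id: int) -> Invoice:
    # BOLA: the object is looked up by ID alone, so any authenticated
    # user can read any invoice by changing the ID in the request.
    return INVOICES[invoice_id]


def get_invoice(requester_id: int, invoice_id: int) -> Invoice:
    # Object-level authorization: verify ownership on every access.
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice.owner_id != requester_id:
        # Same error for "missing" and "not yours" avoids leaking existence.
        raise Forbidden(f"invoice {invoice_id} is not accessible")
    return invoice
```

The two functions differ by a single ownership comparison, which is why this class of flaw is easy to ship and hard to spot without understanding who is supposed to own what.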
Leaked secrets present a massive, high-severity footprint
Exposed secrets greatly expand the blast radius across cloud environments. Among all exposed secrets discovered by the Red Agent, over 61% were rated Critical or High severity. The breakdown by type shows that a significant share of these leaks directly expose core infrastructure components:
- 17.6% are exposed cloud credentials
- 16.5% are leaked API keys
- 8.4% are exposed JWT or session tokens
- 4.7% are exposed private keys and TLS certificates
This breadth of exposure shows that hardcoded credentials can grant attackers immediate, administrative-level entry points across infrastructure.

Remote Code Execution (RCE): Systemic, high-impact risk
RCE is typically treated as rare, but in practice the Red Agent found code execution paths across a significant share of environments: not edge cases, but validated RCE with proof of exploitability.
SQL injection: A consistently severe risk
SQL and NoSQL injection are among the oldest vulnerability classes in web security, and they remain among the most severe. Over 52% of injection findings were of Critical or High severity. SQLi typically means direct database access, sensitive data leakage, and reputational damage, and in many cases it can open a path to deeper compromise or even RCE.
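As a reminder of how small the root cause usually is, the sketch below contrasts a string-built query with a parameterized one using Python's standard `sqlite3` module. The `users` table, its rows, and the function names are made up for the example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (name, email) VALUES (?, ?)",
        [("alice", "alice@example.test"), ("bob", "bob@example.test")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is spliced into the SQL text itself.
    query = f"SELECT name, email FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized: the driver passes `name` as data, never as SQL.
    return conn.execute(
        "SELECT name, email FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = make_db()
payload = "x' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # every row comes back
print(len(find_user(conn, payload)))         # no user has that literal name
```

With the payload above, the unsafe version returns the whole table while the parameterized version returns nothing.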
PII and PHI exposure is prominent
The Red Agent found sensitive personal data actively exposed in production. This is the breakdown across tested environments:
- 54.1% of the exposures directly involved internal employee records.
- 28.6% exposed standard user PII.
- 10.2% compromised customer records.
- 5.8% exposed sensitive patient data (PHI).
- 1.3% leaked student records.
JWT alg:none quietly persists in production
We were surprised by how frequent the alg:none misconfiguration was: it accounted for 63.9% of all JWT authentication bypass findings. Another 8.3% of the JWT bypass findings involved signatures that were not validated at all, and the remaining 27.8% stemmed from miscellaneous JWT infrastructure bugs. We see this issue persisting because underlying token-validation libraries default to insecure modes and development teams continue to ship code without overriding those defaults, essentially leaving a backdoor open for trivial authentication bypass.
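The fix for alg:none is to pin the expected algorithm on the verifying side instead of trusting the token's own header. The following is a minimal sketch using only the Python standard library, assuming an HS256 shared-secret setup; the helper names are hypothetical, and it deliberately omits expiry, audience, and other claim checks that a real verifier needs.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload: dict, key: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url_encode(mac)}"


def verify_hs256(token: str, key: bytes) -> dict:
    """Return the payload, or raise ValueError if the token is not acceptable."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, sig_b64 = parts
    header = json.loads(b64url_decode(header_b64))
    # The verifier decides the algorithm; "none" (or anything else) is rejected.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected alg")
    expected = hmac.new(
        key, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison of the recomputed and presented signatures.
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))
```

A forged token with the header `{"alg":"none"}` and an empty signature fails the algorithm check before any signature logic runs, which is the behavior the vulnerable configurations lack.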

Read the first Red Agent POV blog

Watching the Red Agent chain these attack paths together shows the power of an AI-driven offensive approach: a deep reasoning process lets it uncover hidden, logic-driven vulnerabilities that require context to find.

The first blog in our series dives into a real-world scenario where the Red Agent uncovered a non-trivial SSRF in a GCP Cloud Run service with a URL parameter that only accepts GitHub links.

Over 3 scan runs and 96 iterations, the Red Agent built a mental model of the application's validation logic, learned from blocked attempts, and ultimately discovered a bypass technique that would be impossible to find with signature-based scanning, escalating from SSRF to full credential and source code extraction, all without authentication.
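For readers less familiar with this bug class, the sketch below shows why "only accepts GitHub links" is harder to enforce than it sounds. It does not reproduce the bypass the Red Agent found, which is covered in the linked blog; the function names and example URLs are hypothetical, and the naive prefix check is a generic illustration rather than the affected service's logic.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"github.com"}


def naive_is_github(url: str) -> bool:
    # Prefix matching on the raw string looks reasonable but is bypassable.
    return url.startswith("https://github.com")


def is_allowed_github_url(url: str) -> bool:
    """Parse first, then compare the exact hostname against an allowlist."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme != "https":
        return False
    # Reject userinfo: in "github.com@other.host" the real host is other.host.
    if parts.username is not None or parts.password is not None:
        return False
    if port not in (None, 443):
        return False
    return parts.hostname in ALLOWED_HOSTS


for url in (
    "https://github.com/example/repo",
    "https://github.com.attacker.example/x",
    "https://github.com@attacker.example/",
):
    print(url, naive_is_github(url), is_allowed_github_url(url))
```

Exact-host validation is only one layer: if the server follows redirects or resolves the host again at fetch time, the request can still end up somewhere the check never saw.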

Read the full breakdown in the blog. You can find all the Red Agent POV series blogs here, and stay tuned as we continue to share real-life Red Agent findings in the coming weeks.

Want to see the Red Agent in action? Schedule a live demo with our team.
