Large Language Models (LLMs) are good at coding. This has powered a wave of AI-assisted programming tools, ranging from GitHub Copilot to AI-native IDEs, IDE extensions, and coding agents like Claude Code, Cursor, Windsurf, Codeium, and Roo.
It’s also democratized some elements of development, birthing Vibe Coding, where you “forget that the code even exists.”
Not all AI-assisted coding is vibe coding. Regardless, security risk grows the further developers are removed from the details of generated code, and the more non-developers gain the ability to vibe code risky software.
LLMs Generate Insecure Code, Vibe Coders More So
It’s clear that AI-generated code is not currently secure by default. Based on BaxBench, somewhere between 25% and 70% of working coding outputs from leading models contain vulnerabilities. Some research has also indicated that users given an AI assistant produce more vulnerable code, primarily due to excessive confidence in the generated output. Recent model improvements have dramatically reduced vulnerability prevalence, but there is still more to be done.
When we look specifically at vibe coding, the increase in risk is even starker. Notable anecdotes have crossed into broad awareness, like the vibe coding SaaS entrepreneur whose application became a security piñata thanks to hardcoded secrets, missing authentication and authorization checks, and more. And incidents like it continue to happen.
Improving the Security of AI-Generated Code
Traditional software security tooling still has merit in securing code, no matter how much AI is involved. SAST, SCA, and secrets scanning all have a role to play. The rise of AI-assisted programming increases the importance of shifting such scans left, to the degree possible, into the IDE. PR-time scanning and remediation remains crucial. Creating and adopting secure-by-default frameworks and libraries can also reduce risk.
Legacy approaches aside, AI coding assistants have brought with them one new capability to exert leverage on application security: Rules Files.
Rules Files
Rules files are an emerging pattern that lets you provide standard guidance to AI coding assistants.
You can use these rules to establish project-, company-, or developer-specific context, preferences, or workflows. Most popular AI coding tools support them, including GitHub Copilot Repository Custom Instructions, Codex AGENTS.md, Aider Conventions, Cline Custom Instructions, Claude's CLAUDE.md, Cursor Rules, and Windsurf Rules.
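As a rough illustration, a rules file is typically just a short markdown document checked into the repository. The example below is hypothetical, written for this post rather than excerpted from any particular tool's documentation:

```markdown
# Project rules (illustrative example)

## Context
- This is a Python 3.12 Flask service backed by PostgreSQL.
- Use SQLAlchemy for all database access.

## Conventions
- Prefer small, well-named functions with type hints.
- Add a pytest test alongside every new module.

## Security
- Never hardcode credentials; read secrets from environment variables.
- Validate and sanitize all user-supplied input at API boundaries.
```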
Rules Files for Security
Research consistently finds that crafted prompts can significantly reduce the number of vulnerabilities in AI-generated code. Rules files offer an ideal method to centralize and standardize these security-focused prompting improvements.
To methodically craft a rules file for security, we consider best practices for rules files, the vulnerabilities most common in AI-generated code, and research-based best practices in prompting for secure code generation.
Best practices for rules files
focus on crafting instructions that are clear, concise, and actionable
tailor rules to their relevant scope, such as a particular programming language
break down complex guidelines into smaller, atomic, and composable rules
keep the overall rules file concise, under 500 lines
Insecure habits in AI-generated code
In general, systematic reviews show that CWE-94 (Code Injection), CWE-78 (OS Command Injection), CWE-190 (Integer Overflow or Wraparound), CWE-306 (Missing Authentication for Critical Function), and CWE-434 (Unrestricted File Upload) are all common in AI-generated code. Of course, there is methodological bias here, as most studies specifically target MITRE’s CWE Top 25 Most Dangerous Software Weaknesses.
Unsurprisingly, programming language also matters when it comes to software vulnerabilities. Memory management issues, for example, are overrepresented in C/C++, while deserialization of untrusted data and XML-related vulnerabilities are more common in Python. These patterns map to the weaknesses typical of each language, regardless of AI involvement.
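To make one of these weaknesses concrete, here is a minimal, hypothetical illustration of the OS command injection pattern (CWE-78) that rules can help steer models away from. The function names are ours, not drawn from any study's dataset:

```python
import subprocess

def ping_host_unsafe(hostname: str) -> str:
    # Vulnerable (CWE-78): the hostname is interpolated into a shell command,
    # so input like "example.com; cat /etc/passwd" injects an extra command.
    return subprocess.check_output(f"ping -c 1 {hostname}", shell=True, text=True)

def ping_host_safer(hostname: str) -> str:
    # Safer: pass arguments as a list and avoid the shell entirely, so the
    # hostname is treated as a single argument rather than shell syntax.
    return subprocess.check_output(["ping", "-c", "1", hostname], text=True)
```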
Research-based best practices in prompting for secure code generation
Early investigations in this domain found that “a simple addition of the term ‘secure’ to the prompt led to a reduction in the average weakness density of the generated code by 28.15%, 37.03%, and 42.85% for GPT-3, GPT-3.5, and GPT-4.” The same paper found that a prompt highlighting top CWEs was most effective (“Generate secure Python code that prevents top security weaknesses listed in CWE for the following:”), compared to persona-based and naive security prompts.
Among possible generic prefixes, a different study found that “You are a developer who is very security-aware and avoids weaknesses in the code.” reduced the risk of vulnerable code in a single pass by 47-56% on average, depending on the model. This outperformed alternatives like “make sure every line is secure” and “examine your code line by line and make sure that each line is secure.”
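Taken together, these findings suggest a natural opening for a security-focused rules file: combine the security-aware persona with an explicit instruction to avoid the top CWEs. The wording below is a sketch adapted from the prompts quoted above; how much lift it provides will vary by model and task:

```markdown
<!-- Illustrative security preamble for a rules file, adapted from the research prompts above -->
You are a developer who is very security-aware and avoids weaknesses in the code.
Generate secure code that prevents the top security weaknesses listed in the CWE Top 25.
```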
Open-Sourcing Secure Rules Files
Rules files have yet to see broad adoption for security use cases. The best security rules file is one that is custom to your organization. However, to help get past the blank-page problem, we’re open sourcing a set of baseline secure rules files. These rules were created with the help of Gemini, using a prompt that encodes the guidance outlined above.
Check out the open-source rules files over on GitHub! We’ve generated rules targeting a set of common languages and frameworks:
Python: Flask, Django
JavaScript: React, Node.js
Java: Spring
.NET: ASP.NET Core
Compatible rules are available for all of the following popular AI coding assistants and tools:
Cursor Rules
Cline Rules
Claude CLAUDE.md
Windsurf Rules
Codex AGENTS.md
Aider Conventions
GitHub Copilot Repository Custom Instructions
If you're currently generating code with AI and you're not yet taking advantage of a rules file for security, start here.
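Where each tool looks for its rules file varies, and conventions change between versions, so treat the layout below as a rough sketch of common defaults rather than a guarantee; check each tool's documentation for the current location:

```text
your-repo/
├── .cursor/rules/                 # Cursor project rules
├── .clinerules                    # Cline rules
├── CLAUDE.md                      # Claude Code project memory
├── .windsurfrules                 # Windsurf rules
├── AGENTS.md                      # Codex agent instructions
├── CONVENTIONS.md                 # Aider conventions (loaded with --read)
└── .github/
    └── copilot-instructions.md   # GitHub Copilot repository custom instructions
```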
We’re also open sourcing the prompt used. We welcome contributions, whether they add coverage for additional languages and frameworks or offer evidence-based improvements to the prompt.
References
Not all AI-assisted programming is vibe coding (but vibe coding rocks)
Andrej Karpathy tweet coining “Vibe Coding”
Securing React: Prompt Engineering for Robust and Secure Code Generation - Jim Manico
Academic Papers
Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions
SecRepoBench: Benchmarking LLMs for Secure Code Generation in Real-World Repositories
Benchmarking Prompt Engineering Techniques for Secure Code Generation with GPT Models
CWEVAL: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation
Security and Quality in LLM-Generated Code: A Multi-Language, Multi-Model Analysis
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Is GitHub’s Copilot as Bad as Humans at Introducing Vulnerabilities in Code?
Assessing the Security of GitHub Copilot’s Generated Code - A Targeted Replication Study
A systematic literature review on the impact of AI models on the security of code generation
From Vulnerabilities to Remediation: A Systematic Literature Review of LLMs in Code Security
HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data