AI-Generated Code Breaches
When Copilot, Cursor, and vibe coding tools introduce security vulnerabilities.
Academic research, real-world incidents, and statistics showing how AI code generation tools systematically introduce security vulnerabilities. The data is clear: AI-generated code needs security review before deployment.
VAS detects the specific vulnerabilities AI tools introduce
AI Code Generation Security Statistics
The Broader Trend of AI Code Vulnerabilities
The adoption of AI code generation tools has been one of the fastest technology shifts in software development history. Within two years of GitHub Copilot's launch, the majority of professional developers began using AI assistants in their daily workflow. By 2026, AI-generated code is estimated to account for a significant portion of all new code written globally.
This transformation has brought enormous productivity gains. Developers report writing code 30-55% faster with AI assistance. Boilerplate code, repetitive patterns, and unfamiliar APIs become accessible instantly. But academic research and real-world incidents have revealed a troubling pattern: the code that AI generates, while functional, is systematically less secure than code written by developers without AI assistance.
The security problem with AI-generated code is not random. It follows predictable patterns that reflect how large language models learn from training data. LLMs learn from millions of open-source repositories, many of which contain insecure patterns, deprecated practices, and code that was never intended for production use. When the AI generates code, it reproduces these patterns with confidence, and developers accept them without question because the code looks correct and functions properly.
This page examines the academic research, catalogs real-world incidents, and provides practical guidance for developers who want to use AI coding tools safely. The goal is not to discourage AI-assisted development but to ensure that security review becomes an automatic part of the AI-assisted workflow.
Vulnerability Types in AI-Generated Code
Based on aggregate data from multiple research studies, the most common vulnerability categories found in AI-generated code are injection flaws (SQL injection, XSS), hardcoded credentials, missing authentication and authorization checks, insecure cryptographic choices, and missing input validation.
Academic Research
Do Users Write More Insecure Code with AI Assistants?
Key Findings
- Participants with AI assistance wrote significantly less secure code
- AI assistant users were more confident in their code's security
- The security overconfidence effect was consistent across tasks
- AI-assisted developers were less likely to identify vulnerabilities in their code
Significance: This study (Perry et al., Stanford) was the first large-scale controlled experiment demonstrating that AI assistants can actively harm code security, not just fail to improve it.
Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions
Key Findings
- Approximately 40% of generated code contained security vulnerabilities
- Copilot generates vulnerable patterns even when prompted for security
- CWE-79 (XSS) and CWE-89 (SQL Injection) were most common
- Vulnerability patterns were consistent across multiple programming languages
Significance: This study (Pearce et al., NYU) provided the first systematic evaluation of Copilot's security across the MITRE CWE Top 25 vulnerability categories.
Security Implications of Large Language Models in Software Development
Key Findings
- 30-50% vulnerability rate across AI coding tools
- Security quality varies significantly by programming language
- Python and JavaScript showed highest vulnerability rates
- AI tools performed worse on security-sensitive tasks than general coding
Significance: Established a consensus view across multiple independent studies that AI code generation poses measurable security risks.
LLM-Generated Code and Security: A Developer Survey
Key Findings
- 72% of developers report using AI code generation in their workflow
- Only 23% report regularly reviewing AI-generated code for security
- 58% trust AI-generated code as much or more than their own for security
- Developers with less security training show higher trust in AI output
Significance: Revealed a dangerous gap between the security of AI-generated code and developers' perceptions of its security.
Real-World Incidents
Copilot Suggests Hardcoded AWS Credentials
GitHub Copilot suggested AWS access keys and secret keys in code completions, apparently learned from public repositories where these keys had been committed. Multiple developers reported accepting these suggestions without realizing they contained real credentials from other projects.
Impact
Potential use of other users' AWS credentials; exposed credentials in new repositories
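One practical mitigation is to scan code for credential patterns before it is committed. The sketch below uses the publicly documented AWS access key ID format (the literal prefix "AKIA" followed by 16 uppercase alphanumeric characters); the function name and snippet are illustrative, and a real setup would use a dedicated secret scanner rather than a single regex.

```python
import re

# AWS access key IDs follow a well-known format: "AKIA" followed by
# 16 uppercase letters or digits. Scanning code for this pattern catches
# most accidentally accepted credential suggestions before they land.
AWS_KEY_PATTERN = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

def find_aws_keys(source: str) -> list[str]:
    """Return any substrings that look like AWS access key IDs."""
    return AWS_KEY_PATTERN.findall(source)

# The canonical example key from AWS documentation is flagged:
snippet = 'client = boto3.client("s3", aws_access_key_id="AKIAIOSFODNN7EXAMPLE")'
print(find_aws_keys(snippet))  # ['AKIAIOSFODNN7EXAMPLE']
```

Hooking a check like this into a pre-commit hook or CI step turns a silent credential leak into a loud build failure.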
AI-Generated SQL Without Parameterization
AI coding assistants consistently generate SQL queries using string concatenation instead of parameterized queries, especially in Python and JavaScript. This is one of the most reliably reproduced vulnerability patterns across all AI coding tools.
Impact
SQL injection vulnerabilities in production applications, leading to data breaches
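The difference between the concatenated pattern AI tools tend to emit and the parameterized form is small in the code but decisive for security. A minimal self-contained sketch using Python's built-in sqlite3 (the table and payload are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice@example.com')")

user_input = "' OR '1'='1"  # classic injection payload

# Vulnerable: string interpolation lets the payload rewrite the query,
# turning a lookup for one email into "WHERE email = '' OR '1'='1'".
vulnerable = f"SELECT * FROM users WHERE email = '{user_input}'"
print(conn.execute(vulnerable).fetchall())  # returns every row

# Safe: the driver passes the value separately from the SQL text,
# so the payload is treated as a literal string, not as SQL.
safe = "SELECT * FROM users WHERE email = ?"
print(conn.execute(safe, (user_input,)).fetchall())  # returns nothing
```

The same placeholder style exists in every mainstream driver and ORM; the fix is never more code, only different code.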
Cursor-Generated Auth with Token in localStorage
When asked to implement authentication, Cursor generated code storing JWT tokens in localStorage. While functional, this makes tokens accessible to any JavaScript running on the page, enabling XSS-based session theft. HttpOnly cookies are the secure alternative.
Impact
Session hijacking through XSS attacks on production applications
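On the server side, the HttpOnly alternative amounts to setting the right flags on the session cookie. A sketch using Python's stdlib `http.cookies` to build the Set-Cookie header (the token value is a placeholder; real frameworks expose the same flags through their own response APIs):

```python
from http.cookies import SimpleCookie

# HttpOnly keeps the value out of reach of document.cookie, so an XSS
# payload cannot exfiltrate it; Secure restricts it to HTTPS; SameSite
# limits cross-site sending, reducing CSRF exposure.
cookie = SimpleCookie()
cookie["session"] = "placeholder-token-value"
cookie["session"]["httponly"] = True
cookie["session"]["secure"] = True
cookie["session"]["samesite"] = "Strict"
cookie["session"]["path"] = "/"

header = cookie["session"].OutputString()
print("Set-Cookie:", header)
```

With this in place, the browser still sends the token on every request, but page JavaScript can never read it.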
Vibe-Coded API Without Rate Limiting
AI-generated APIs consistently lack rate limiting, request throttling, and cost controls. When applications go viral or are targeted by attackers, the absence of these controls leads to service degradation, denial of service, and unexpected cloud bills reaching tens of thousands of dollars.
Impact
Financial losses from cloud bill shock, service outages, denial of service
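Rate limiting does not require much code. The token-bucket sketch below is illustrative (class and parameter names are hypothetical; production systems typically use a gateway, reverse proxy, or a shared store like Redis so limits survive restarts and apply across instances):

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter: `rate` requests/second sustained,
    with bursts of up to `capacity` requests."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the time elapsed since the last check.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)  # 5 req/s, bursts of 10
results = [bucket.allow() for _ in range(15)]
print(results.count(True))  # the burst of 10 passes; the rest are throttled
```

In practice one bucket per client identifier (API key, IP address) is enough to stop both abuse and runaway cloud bills.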
Insecure Random Number Generation for Tokens
AI assistants frequently suggest using Math.random() or similar non-cryptographic random number generators for creating session tokens, reset codes, and API keys. These values are predictable and can be guessed by attackers.
Impact
Token prediction enabling account takeover, password reset bypass
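In Python, the fix is a one-line swap from the `random` module to the stdlib `secrets` module, which draws from the operating system's CSPRNG and exists precisely for tokens, reset codes, and API keys:

```python
import random
import secrets

# Predictable: random uses the Mersenne Twister, which is not
# cryptographically secure; observing enough outputs lets an attacker
# reconstruct the generator state and predict future tokens.
weak_token = "".join(random.choices("0123456789abcdef", k=32))

# Unpredictable: secrets draws from the OS CSPRNG.
strong_token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
reset_code = secrets.token_hex(16)        # 32 hex characters
print(strong_token, reset_code)
```

Other ecosystems have direct equivalents (for example, `crypto.randomBytes` or the Web Crypto API in JavaScript instead of `Math.random()`).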
Why AI-Generated Code Is Systematically Insecure
Training Data Reflects Insecure Practices
LLMs learn from millions of repositories, many containing insecure code, tutorials, and Stack Overflow answers that prioritize functionality over security. The AI reproduces the most common patterns, which are not the most secure.
No Understanding of Security Context
AI does not understand what data your application handles, who should access what, or what compliance requirements apply. It cannot reason about authorization because it does not understand your business logic.
Optimization for Functionality
AI models are trained and evaluated on whether the generated code works correctly, not whether it is secure. Code that compiles, passes tests, and produces the right output scores well even if it has security flaws.
Security Overconfidence Effect
Developers using AI assistants report higher confidence in their code's security while producing less secure code. The professional appearance of AI output creates a false sense of security.
Missing Security-by-Default Patterns
Secure defaults (HTTPS enforcement, CSRF tokens, RLS policies, security headers) are opt-in in most frameworks. AI tools rarely add security features that are not explicitly requested.
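Because these defaults are opt-in, they are easy to centralize rather than remember per response. A framework-agnostic sketch (the header set is a common baseline, not an exhaustive or authoritative list; names and values here are illustrative):

```python
# A baseline of headers that most frameworks leave opt-in and that
# AI-generated configurations rarely add unprompted.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=63072000; includeSubDomains",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Content-Security-Policy": "default-src 'self'",
    "Referrer-Policy": "no-referrer",
}

def with_security_headers(headers: dict[str, str]) -> dict[str, str]:
    """Merge baseline security headers into a response's headers,
    without overwriting values the application set deliberately."""
    return {**SECURITY_HEADERS, **headers}

response_headers = with_security_headers({"Content-Type": "application/json"})
print(response_headers["X-Frame-Options"])  # DENY
```

Most frameworks let this run as middleware so every response gets the baseline automatically.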
Rapid Development Reduces Review
When AI generates code 50% faster, the time saved rarely goes to security review. Instead, developers ship faster, reducing the window for catching security issues before deployment.
Using AI Code Tools Safely
The solution is not to stop using AI coding tools. The productivity gains are too significant to ignore, and AI-assisted development is becoming the norm. Instead, the solution is to treat AI-generated code the way you would treat code from a junior developer: assume it works but verify its security before deploying to production.
Code Review Practices
- Review all AI-generated auth and authorization code manually
- Check for hardcoded credentials in every AI suggestion
- Verify that SQL queries use parameterized statements
- Ensure input validation exists on both client and server
- Check that error handling does not expose internal details
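The server-side validation item on this checklist is the one AI tools most often skip, since the generated client already "works". A minimal sketch of what reviewing for it means in practice (field names, the email regex, and the password rule are illustrative assumptions, not a complete validation policy):

```python
import re

# Deliberately loose email shape check; real applications would use a
# vetted validation library and confirm via an email round-trip.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_signup(data: dict) -> list[str]:
    """Server-side validation: never assume the client already checked."""
    errors = []
    if not EMAIL_RE.match(data.get("email", "")):
        errors.append("invalid email")
    if len(data.get("password", "")) < 12:
        errors.append("password too short")
    return errors

print(validate_signup({"email": "alice@example.com", "password": "hunter2"}))
# ['password too short']
```

The point of the review is simply to confirm a function like this runs on the server for every write path, regardless of what the client-side form enforces.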
Automated Security Checks
- Run DAST scanning (VAS) on deployed applications
- Use SAST tools in your CI/CD pipeline
- Enable dependency scanning (Snyk, Dependabot)
- Set up secret scanning on your repository
- Implement automated security tests for critical flows
Prompting for Security
- Explicitly ask AI tools to include input validation
- Prompt for server-side authentication checks
- Request environment variable usage for all credentials
- Ask for security headers in configuration files
- Request rate limiting on public API endpoints
Security Knowledge
- Learn OWASP Top 10 vulnerabilities to spot them in AI output
- Understand your framework's security features and defaults
- Know the difference between authentication and authorization
- Learn how your database platform (Supabase/Firebase) security works
- Stay current on AI code security research and best practices
Scan Your AI-Generated Application
VAS detects the specific vulnerability patterns that AI code generation tools introduce. From exposed credentials to missing authentication, get a comprehensive security report for your application in minutes.
Frequently Asked Questions
Does GitHub Copilot generate insecure code?
Research has consistently shown that AI coding assistants including GitHub Copilot can generate code with security vulnerabilities. A Stanford study found that participants using AI assistants wrote significantly less secure code than those coding without AI assistance. A NYU study found that approximately 40% of Copilot-generated code snippets contained security vulnerabilities. Developers using Copilot also tended to be more confident in the security of their code, despite it being less secure.
What types of vulnerabilities do AI coding tools commonly introduce?
AI coding tools commonly introduce: injection vulnerabilities (SQL injection, XSS) from code that does not sanitize user input, authentication and authorization flaws from client-side-only security checks, hardcoded credentials, insecure cryptographic choices, missing input validation, insecure deserialization, information disclosure through verbose error handling, and missing security headers.
Is AI-generated code getting more secure over time?
AI code generation tools are gradually improving but the improvement is insufficient for production use without review. GitHub has added Copilot code scanning features, and models have been fine-tuned to avoid some vulnerability patterns. However, AI models fundamentally do not understand the security context of applications being built and cannot reason about authorization requirements.
Should I stop using AI coding tools due to security concerns?
No, but use them with appropriate security guardrails. Treat AI-generated code as a first draft requiring security review. Use scanning tools like VAS before deployment, be careful with AI-generated auth code, never let AI handle secrets management, and run automated security tests in your CI/CD pipeline.
Are there statistics on AI code generation security?
Multiple studies have quantified the issue: 40% of Copilot suggestions contain vulnerabilities (NYU), developers with AI produce less secure code while being more confident (Stanford), 30-50% vulnerability rate across tools (systematic review of 27 studies), and only 23% of developers regularly review AI code for security (developer survey).
Last updated: February 2026