What is RAG data poisoning?

RAG data poisoning is an attack where an adversary injects malicious documents into the knowledge base that a retrieval-augmented generation system relies on. When the LLM retrieves these poisoned documents as context, it generates attacker-controlled responses. Research shows that injecting as few as five documents into a database of millions can achieve a 90% attack success rate.

How does RAG poisoning differ from prompt injection?

Prompt injection targets the model's instruction layer directly through user input. RAG poisoning targets the data layer by corrupting the knowledge base the model retrieves from. The LLM treats retrieved documents as authoritative context, so poisoning the retrieval pipeline can be more effective than attacking the model directly, and no jailbreak is required.

What is OWASP LLM08:2025?

LLM08:2025 is a new entry in the OWASP Top 10 for LLM Applications that specifically addresses vector and embedding weaknesses. It covers unauthorized access to embeddings, cross-context information leakage in multi-tenant environments, embedding inversion attacks, data poisoning, and retrieval manipulation.

Can attackers recover original text from vector embeddings?

Yes. Embedding inversion attacks can recover 50 to 70 percent of input words from sentence embeddings. More advanced techniques like ZSinvert achieve over 80% sensitive information leakage rates. Embeddings should be treated with the same confidentiality as the original text.

How do you audit a RAG pipeline for security?

A RAG security audit covers the full pipeline: document ingestion validation, embedding storage access controls, retrieval logic integrity, multi-tenant isolation, output filtering, and monitoring. BeyondScale's AI security assessments include RAG-specific testing for data poisoning, cross-tenant leakage, embedding exposure, and access control failures.

RAG Security: How Attackers Poison Your Knowledge Base

Your RAG pipeline has a trust problem. Every document in your knowledge base becomes part of your LLM's context window, and your model treats all of it as authoritative. If an attacker can inject even a handful of documents into that knowledge base, they control what your AI says. This is not theoretical: published research demonstrates that five poisoned documents in a database of millions can redirect responses with 90% reliability.

This guide covers the specific attack classes that target RAG systems, the research quantifying their impact, and the controls that actually reduce risk. If you operate a RAG-powered application in production, this is the security surface you are probably not testing.

Why RAG Introduces Attack Surfaces Traditional AppSec Misses

Traditional application security focuses on input validation, authentication, and authorization at the API layer. RAG adds an entirely new data plane: the knowledge base and its retrieval pipeline. This data plane sits between your users and your model, and it operates under a different trust model than the rest of your application.

When a user sends a query, the retrieval system searches a vector database for semantically similar documents, passes those documents to the LLM as context, and the model generates a response grounded in that context. The security problem is straightforward: the LLM cannot distinguish between legitimate documents and poisoned ones. It treats everything it retrieves as ground truth.

This means your knowledge base is now a first-class attack surface. Compromising it does not require breaking into the model, cracking API keys, or exploiting a web vulnerability. It requires getting malicious content into the document pipeline, and in many enterprise deployments, that pipeline ingests data from wikis, shared drives, ticketing systems, and third-party feeds with minimal validation.

OWASP recognized this gap in 2025 by adding LLM08:2025, Vector and Embedding Weaknesses, as a new entry in the Top 10 for LLM Applications. It is the framework's acknowledgment that the retrieval layer is a distinct and undertested attack surface.

RAG Data Poisoning: How It Works

RAG data poisoning targets the knowledge base rather than the model itself. The attacker's goal is to inject documents that will be retrieved for specific queries and will cause the LLM to generate attacker-chosen responses.

The PoisonedRAG Attack

The most rigorously studied attack is PoisonedRAG, published at USENIX Security 2025 by researchers at Penn State and the Illinois Institute of Technology. The attack formulates document injection as an optimization problem with two conditions:

Retrieval condition: The malicious document must be retrieved when a target question is asked. The attacker crafts the document's content to maximize cosine similarity with the target query's embedding.

Generation condition: The malicious document must cause the LLM to generate a specific target answer. The attacker uses vocabulary engineering, inserting phrases like "according to updated records" or "corrected figures," to increase the model's likelihood of trusting the poisoned content.

The results are significant. In black-box settings where the attacker has no access to the model's internals, PoisonedRAG achieved attack success rates of 97% on Natural Questions, 99% on HotpotQA, and 91% on MS-MARCO. Injecting just five documents per target question into a knowledge base containing millions of documents was sufficient.

CorruptRAG: Single-Document Attacks

Research published in January 2026 introduced CorruptRAG, which reduces the attack to a single poisoned document. This is more realistic for environments where bulk injection would trigger anomaly detection. CorruptRAG achieves comparable success rates to PoisonedRAG while requiring fewer injections, making it harder to detect through volume-based monitoring.

The Wikipedia Edit Vector

A practical attack pattern that requires no sophisticated optimization: an attacker briefly edits a Wikipedia article, public documentation page, or GitHub README with poisoned content. If the RAG system's ingestion pipeline scrapes that source during a scheduled update, the poisoned version enters the vector database. Even after the edit is reverted at the source, the poisoned embedding persists in the knowledge base until the next full re-indexing cycle, which in many organizations happens weekly or monthly.

Vector Database Access Control Failures

The most common enterprise RAG vulnerability is not a sophisticated attack at all. It is inadequate access control on the vector database.

In a multi-tenant RAG deployment, documents from different users, teams, or customers share the same vector store. Without proper access partitioning, a query from User A can retrieve documents that belong to User B. This is cross-tenant data leakage, and it happens because many vector databases were not designed with per-document access control as a first-class feature.

The failure modes include:

Missing tenant isolation: All documents indexed into a single collection without metadata filtering. Any query can match any document regardless of the requester's authorization level.
Metadata-only filtering: Access control implemented as a metadata filter at query time, but the underlying vectors are accessible to anyone with database credentials. An attacker who compromises the database layer bypasses all tenant isolation.
Stale permission propagation: When a user's access is revoked in the source system (e.g., SharePoint, Confluence), the corresponding vectors retain the old permissions until the next re-indexing cycle.
No audit trail: Queries and retrievals are not logged, making it impossible to detect unauthorized cross-tenant access after the fact.

The fix is retrieval-native access control: every vector inherits the permissions of its source document, and the retrieval layer enforces those permissions at query time, not as a post-processing filter. Attribute-based access control (ABAC) provides the granularity that traditional RBAC often lacks in this context.

Embedding Inversion: Recovering Data from Stolen Vectors

Even if your access controls are solid, the embeddings themselves are a liability. Embedding inversion attacks demonstrate that vector representations are not one-way transformations. An attacker who gains access to your vector database can reconstruct significant portions of the original text.

Research on sentence embedding inversion shows that attackers can recover 50 to 70% of input words from standard sentence embeddings (measured by F1 score). More advanced techniques push this further: ZSinvert, published in March 2025, is a zero-shot inversion method that achieves over 80% sensitive information leakage rates across all tested encoders using the Enron email corpus. The ALGEN framework, published in February 2025, demonstrated that as few as 1,000 alignment samples are sufficient to mount a partially successful inversion attack on black-box encoders.

The implication is clear: sharing embeddings with third-party services, or storing them in insufficiently protected vector databases, is functionally equivalent to sharing the original documents. In regulated industries such as healthcare and finance, where the source documents contain PII, PHI, or trade secrets, this is a compliance failure waiting to happen.

Defenses include encrypting embeddings at rest and in transit, applying differential privacy noise during embedding generation, and for high-stakes deployments, using cryptographic approaches like homomorphic encryption that allow similarity search on encrypted vectors. None of these are simple to implement, but the research is unambiguous about the risk.

Multi-Tenant RAG: Cross-Tenant Leakage Patterns

Multi-tenant RAG deployments combine the access control problem with the data sensitivity problem. When multiple customers' data shares the same vector infrastructure, the blast radius of any single failure extends across tenant boundaries.

Common leakage patterns include:

Semantic overlap retrieval: Customer A asks a question whose embedding is semantically close to Customer B's confidential documents. Without strict tenant partitioning, those documents appear in Customer A's context window.

Shared embedding model leakage: If the embedding model is fine-tuned on data from multiple tenants, it may encode cross-tenant information into the embedding space itself.

Index-level side channels: Vector databases that use approximate nearest neighbor (ANN) indexes can leak information about the distribution of other tenants' data through query timing or result count patterns.

The strongest mitigation is physical isolation: separate vector database instances per tenant. When that is not feasible due to cost or operational complexity, the minimum controls are logical namespace isolation with per-query tenant filtering enforced at the database layer (not the application layer), combined with regular cross-tenant retrieval testing.

RAG Security Audit Checklist: 10 Controls to Validate

If you are responsible for a production RAG system, these are the controls to test:

Document Ingestion

Source validation: Are all document sources authenticated and verified? Is there a whitelist of approved data sources, or does the pipeline accept arbitrary inputs?

Content integrity: Are ingested documents scanned for injection payloads, authority manipulation phrases, and anomalous patterns before embedding?

Provenance tracking: Can you trace every document in the vector database back to its original source and ingestion timestamp?

Vector Storage

Access control: Does every vector inherit the permissions of its source document? Are permissions enforced at the retrieval layer, not just as application-level filters?

Tenant isolation: In multi-tenant deployments, are vectors physically or logically isolated per tenant? Can a query from one tenant ever retrieve another tenant's documents?

Embedding protection: Are embeddings encrypted at rest? Is access to raw embeddings restricted to prevent inversion attacks?

Retrieval and Generation

Retrieval monitoring: Are all queries and retrieved documents logged? Is there anomaly detection on retrieval patterns (e.g., sudden spikes in cross-namespace matches)?

Output filtering: Does the generation layer include guardrails that detect when retrieved context contains potential injection payloads or contradicts known-good information?

Operational

Re-indexing cadence: How quickly can poisoned documents be purged? Is there a process for emergency re-indexing when a compromise is detected?

Permission synchronization: When user access changes in source systems, how quickly do those changes propagate to the vector database? What is the maximum window of stale permissions?

Practical Recommendations

Based on the research and the audit patterns we see across RAG deployments, here is what to prioritize:

Start with access controls. The majority of RAG security failures are access control issues, not sophisticated poisoning attacks. Ensure tenant isolation, permission inheritance, and audit logging are in place before investing in advanced defenses.

Validate your ingestion pipeline. Implement content scanning for known poisoning patterns. Monitor for documents containing authority manipulation language ("CORRECTED", "OFFICIAL UPDATE", "SUPERSEDES PREVIOUS"). Track provenance for every document.

Treat embeddings as sensitive data. If the source documents are confidential, the embeddings are too. Apply the same encryption and access control standards to your vector database that you apply to your production database.

Test for poisoning. Include RAG poisoning scenarios in your AI security assessments. Inject test documents with known payloads and verify that your pipeline detects or mitigates them. Tools like Promptfoo and Garak support RAG-specific test scenarios.

Monitor retrieval patterns. Anomaly detection on retrieval logs can catch both poisoning attacks (unusual documents being retrieved for common queries) and access control failures (cross-tenant retrievals). If you are not logging retrievals, you cannot detect compromises.

For a deeper understanding of how RAG poisoning relates to direct prompt manipulation, see our guide to prompt injection attacks. For the broader LLM vulnerability landscape, including how LLM08:2025 fits alongside the other nine categories, see our OWASP LLM Top 10 walkthrough. If you are building agentic RAG systems, our agentic RAG architecture guide covers the patterns, but this post covers the security surface those patterns create.

Conclusion

RAG data poisoning is a practical, well-researched attack class with published tooling and demonstrated success rates against production-grade systems. The controls to mitigate it are known but underdeployed. If you are running a RAG application and have not specifically tested for poisoning, cross-tenant leakage, and embedding exposure, those are open risks.

Run a Securetom scan against your RAG-powered application to identify exposed attack surfaces, or book an AI security assessment for a full evaluation of your RAG pipeline security.

RAG Security: How Attackers Poison Your Knowledge Base

Why RAG Introduces Attack Surfaces Traditional AppSec Misses

RAG Data Poisoning: How It Works

The PoisonedRAG Attack

CorruptRAG: Single-Document Attacks

The Wikipedia Edit Vector

Vector Database Access Control Failures

Embedding Inversion: Recovering Data from Stolen Vectors

Multi-Tenant RAG: Cross-Tenant Leakage Patterns

RAG Security Audit Checklist: 10 Controls to Validate

Document Ingestion

Vector Storage

Retrieval and Generation

Operational

Practical Recommendations

Conclusion

AI Security Audit Checklist

Sai Rajasekhar Kurada

Related Articles

Slack AI Enterprise Security: CISO Hardening Guide 2026

LLM Observability Security Risks: CISO Guide 2026

Deepfake CEO Fraud: Voice Cloning Defense Playbook 2026

Ready to Secure Your AI Systems?