Vector database security is the most commonly overlooked layer in enterprise AI stacks. Organizations spend significant effort securing LLM API keys and system prompts, while the databases storing their most sensitive embedded data ship with authentication disabled by default.
This guide covers the four primary attack classes against vector stores, specific misconfiguration patterns in Pinecone, Weaviate, Chroma, and Qdrant, recent CVEs with exploit paths, and a 15-point hardening checklist applicable to any RAG deployment.
Key Takeaways
- Embeddings are not anonymous data. Research shows 92% of input text can be reconstructed from embeddings alone using the Vec2Text method (Morris et al., EMNLP 2023).
- A 2025 UpGuard survey found 1,170 internet-accessible Chroma databases, roughly one-third actively exposing production data with no authentication.
- CVE-2025-64513 in Milvus (CVSS 9.3) required a single HTTP header with a hardcoded constant to bypass all authentication. A public PoC scanner exists.
- PoisonedRAG achieved a 99% attack success rate on HotpotQA by injecting 5 documents into a million-document corpus (USENIX Security 2025).
- OWASP added LLM08:2025 "Vector and Embedding Weaknesses" as a new category in the 2025 LLM Top 10, reflecting how quickly RAG became the dominant enterprise AI architecture.
- Pinecone namespaces are not security boundaries. Weaviate only added RBAC in v1.29.0. Qdrant's official documentation explicitly states instances are insecure by default.
- Including vector database scope in AI security assessments is now standard practice for any organization running RAG in production.
Why Vector Databases Are the Silent Risk in Enterprise AI
The typical enterprise RAG architecture moves proprietary business data through four stages: source documents (contracts, clinical notes, source code, financial records), an embedding model, a vector database, and a retrieval-augmented generation pipeline that feeds context to an LLM.
Organizations generally treat the LLM as the security perimeter. Access controls, prompt filtering, and output monitoring all focus on the LLM boundary. The vector database sits behind that perimeter, often on internal networks with permissive defaults, holding high-fidelity representations of the most sensitive data in the organization.
The core assumption that makes this feel safe is wrong: embeddings look like opaque numeric arrays, so practitioners treat them as anonymized. They are not.
John X. Morris, Volodymyr Kuleshov, Vitaly Shmatikov, and Alexander M. Rush at Cornell published "Text Embeddings Reveal (Almost) As Much As Text" at EMNLP 2023. Their Vec2Text method reconstructed 92% of 32-token input texts exactly from their embeddings. The method works iteratively: generate a hypothesis text, embed it, and minimize cosine distance to the target embedding through correction steps. No model weights required. Only the embedding API.
Applied to MIMIC-III clinical notes, the method recovered full patient names with high accuracy. The 2025 follow-on paper (arXiv:2504.16609) reported cosine similarity of 89.4 and an LLM Judge leakage score of 92.0 against black-box encoders. A zero-shot variant (arXiv:2504.00147, April 2025) achieved meaningful inversion without any training on the target model at all.
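To make the correct-and-re-embed loop concrete, here is a toy sketch. The toy_embed function is a stand-in for a black-box embedding API, and the candidate list replaces the trained correction model Vec2Text actually uses; everything here is illustrative, not the published implementation.

```python
import math
import random

def toy_embed(text: str, dim: int = 16) -> list[float]:
    # Stand-in for a black-box embedding API: deterministic hash-seeded
    # projection. The attacker only needs query access to this function.
    rng = random.Random(text)
    return [rng.uniform(-1, 1) for _ in range(dim)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def invert(target_emb, candidates, steps=3):
    # Skeleton of the inversion loop: propose a hypothesis, embed it,
    # keep whichever candidate moves closest to the target embedding.
    # (Vec2Text uses a trained correction model, not enumeration.)
    best, best_sim = None, -2.0
    for _ in range(steps):
        for cand in candidates:
            sim = cosine(toy_embed(cand), target_emb)
            if sim > best_sim:
                best, best_sim = cand, sim
    return best, best_sim

secret = "patient John Doe, admitted 2023-04-01"
stolen_embedding = toy_embed(secret)  # what leaks from the vector store
guesses = [
    "patient Jane Roe, admitted 2023-05-02",
    secret,
    "quarterly revenue figures",
]
recovered, sim = invert(stolen_embedding, guesses)
print(recovered, round(sim, 3))
```

The loop converges because the embedding API itself acts as the oracle: any candidate whose embedding lands closer to the stolen vector is, by construction, closer to the source text.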
The practical implication: if an attacker can query your vector database's embedding store, they can recover the source text. Every exposure is a data breach.
The Four Attack Classes Against Vector Stores
Embedding Inversion
Covered above. The attack requires read access to stored embeddings. Common exposure paths include overly broad API keys, namespace RBAC gaps, and insecure vector database endpoints reachable from application tiers without additional authentication.
Defense: apply Gaussian noise to stored embeddings. The Vec2Text paper demonstrated that well-calibrated noise significantly degrades reconstruction accuracy while preserving retrieval performance. The SPARSE framework (arXiv:2602.07090, February 2026) provides a more sophisticated approach using dimension-sensitive Mahalanobis noise that targets semantically critical embedding dimensions while minimally affecting retrieval quality.
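A minimal stdlib sketch of the Gaussian-noise defense, using a hypothetical 256-dimension embedding; the sigma value is illustrative and would need calibration against your own retrieval benchmarks.

```python
import math
import random

random.seed(42)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def add_gaussian_noise(embedding, sigma=0.05):
    # Perturb each dimension before storage. Sigma must be calibrated:
    # large enough to degrade inversion, small enough to keep
    # nearest-neighbour rankings stable.
    return [x + random.gauss(0.0, sigma) for x in embedding]

clean = [random.uniform(-1, 1) for _ in range(256)]
noised = add_gaussian_noise(clean, sigma=0.05)
print(round(cosine(clean, noised), 3))  # stays high: retrieval mostly preserved
```

Store only the noised vector; the clean embedding never persists, so a read of the store yields degraded inversion targets while similarity search still ranks neighbours correctly.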
Knowledge Base Poisoning
The PoisonedRAG attack (Zou, Geng et al., USENIX Security 2025) defines the threat clearly. An attacker injects a small number of adversarially crafted documents into the vector store. The documents are crafted to cluster near legitimate high-frequency queries in embedding space. When a user issues the target query, the retriever surfaces the malicious document, and the LLM generates the attacker's intended answer.
Results: 90%+ attack success rate injecting 5 malicious documents into a corpus of millions. In a black-box setting using PaLM 2: 97% ASR on Natural Questions, 99% on HotpotQA, 91% on MS-MARCO. The paper evaluated standard defenses (perplexity filtering, paraphrasing, semantic similarity checks) and found all of them insufficient.
Snyk Labs demonstrated the attack in practice with RAGPoison: inserting 274,944 poisoned vector points with prompt injection payloads into an unprotected Qdrant instance took approximately 80 seconds using standard Python batching. The exploit path is straightforward: write access to Qdrant (the default configuration), batch-insert embeddings carrying instruction payloads, and the payload executes when retrieved into LLM context.
The real-world Slack AI incident (August 2024, documented by PromptArmor) showed the same class of attack in a production system. An attacker posted malicious instructions in a public Slack channel. Slack AI's RAG pipeline retrieved it as context. The embedded instructions tricked the AI into constructing a phishing link that exfiltrated data from private channels the attacker had no access to.
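A toy illustration of why this attack class works: a poisoned entry crafted to sit near the target query's embedding outranks legitimate documents in a cosine-similarity retriever. The three-dimensional vectors are obviously artificial; a real attack optimizes the document text so a production embedding model produces this geometry.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 3-d embeddings standing in for a real embedding model.
corpus = {
    "legit-1": [0.9, 0.1, 0.0],
    "legit-2": [0.1, 0.9, 0.0],
    # Poisoned entry: crafted to sit almost exactly on the target
    # query's embedding, so it wins the similarity ranking. Its text
    # would carry the attacker's instruction payload.
    "poison":  [0.70, 0.69, 0.01],
}

target_query = [0.7, 0.7, 0.0]

def retrieve(query, k=1):
    ranked = sorted(corpus, key=lambda d: cosine(corpus[d], query), reverse=True)
    return ranked[:k]

print(retrieve(target_query))  # the poisoned document is retrieved first
```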
RBAC Bypass and Namespace Boundary Violations
Pinecone namespaces are logical partitions within a single index, not security boundaries. A compromised or overly scoped API key with index-level access reaches all namespaces in that index. CVE-2024-41892 (CVSS 7.5) demonstrated the consequence: the RBAC check was performed after returning vector data rather than before. The result was that 200,000+ healthcare embeddings crossed organization boundaries.
IronCore Labs' assessment characterized Pinecone's RBAC model as operating entirely at the application layer: "Pinecone can't and won't ever enforce the same permissions on the data as the native data stores." Cryptographic isolation between tenants is absent.
Weaviate's pre-v1.29.0 authorization model provided only two roles: Admin and Read-Only. Weaviate's own documentation explicitly recommends migrating existing deployments from Admin List to RBAC, noting that Admin List "only provides coarse-grained access control." The RBAC module, released in v1.29.0, provides fine-grained permission management at the collection level. Many enterprise deployments are on older versions.
Misconfiguration Exposure
The default security posture of widely deployed vector databases is open.
Qdrant's official documentation states: "By default, all self-deployed Qdrant instances are not secure. They are open to all network interfaces and do not have any kind of authentication configured. They may be open to everybody on the internet without any restrictions."
Qdrant added static API keys and read-only API keys in earlier versions, and JWT-based granular access control in v1.9.0. TLS is not enabled by default. Without TLS, API key authentication is vulnerable to network interception.
Chroma's default configuration has no authentication. A 2025 UpGuard survey scanned the internet and found 1,170 accessible Chroma databases. Approximately 406 (roughly one-third) were actively exposing data to anonymous access. 60% of those had more than one collection, indicating production use rather than test instances. One exposed instance had 4,315 collections.
The broader landscape from 2024 analysis shows the same pattern: Chroma, Weaviate, Milvus, and Qdrant all have significant percentages of publicly reachable instances running without authentication.
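A minimal probe for this class of exposure, intended only for scanning infrastructure you own. The REST paths are assumptions drawn from each project's public docs and may differ across versions.

```python
import urllib.error
import urllib.request

# Assumed default REST paths (verify against the version you run):
# Qdrant lists collections at /collections; Chroma exposes a heartbeat
# at /api/v1/heartbeat.
PROBES = {
    "qdrant": "/collections",
    "chroma": "/api/v1/heartbeat",
}

def classify(status: int) -> str:
    # 200 without credentials means the instance answers anonymous
    # requests; 401/403 means some auth layer is in place.
    if status == 200:
        return "OPEN"
    if status in (401, 403):
        return "AUTH_REQUIRED"
    return "UNKNOWN"

def probe(base_url: str, kind: str, timeout: float = 3.0) -> str:
    req = urllib.request.Request(base_url.rstrip("/") + PROBES[kind])
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return classify(resp.status)
    except urllib.error.HTTPError as exc:
        return classify(exc.code)
    except OSError:
        return "UNREACHABLE"

# Example (only against hosts you are authorized to test):
# print(probe("http://10.0.0.5:6333", "qdrant"))
```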
CVE Analysis: Recent Critical Vulnerabilities
CVE-2025-64513: Milvus Authentication Bypass (CVSS 9.3)
This is the clearest example of what default-insecure looks like at the code level. The AuthenticationInterceptor function in internal/proxy/authentication_interceptor.go base64-decodes an incoming HTTP header value and compares it against a hardcoded constant: @@milvus-member@@.
Any attacker who knows this constant (it is in the open-source repository) can forge the header and bypass all authentication. The result is unauthenticated full administrative access: read, modify, and delete all data; execute privileged administrative operations.
Exploitation requires a single crafted HTTP request. No prior credentials. A public PoC scanner is available. Affected versions: Milvus 2.4.0 through 2.4.23, 2.5.0 through 2.5.20, and 2.6.0 through 2.6.4. Patched in 2.4.24, 2.5.21, and 2.6.5.
If you operate Milvus in any of the affected version ranges, this is a patch-immediately issue. Temporary mitigation: strip the sourceID header at your API gateway before it reaches the Milvus proxy.
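At the gateway, that mitigation amounts to dropping the header before forwarding. A minimal sketch of the filter logic (the header name follows the advisory; wire this into whatever middleware hook your proxy exposes):

```python
BLOCKED_HEADERS = {"sourceid"}  # the header CVE-2025-64513 abuses

def strip_auth_bypass_headers(headers: dict[str, str]) -> dict[str, str]:
    # Drop any client-supplied header that the Milvus proxy interprets
    # as an internal-member credential, before forwarding upstream.
    return {k: v for k, v in headers.items() if k.lower() not in BLOCKED_HEADERS}

incoming = {
    "Authorization": "Bearer user-token",
    "sourceID": "QEBtaWx2dXMtbWVtYmVyQEA=",  # base64 of @@milvus-member@@
}
print(strip_auth_bypass_headers(incoming))
```

This is defense-in-depth only; the affected versions remain vulnerable to any traffic path that bypasses the gateway, so patching is still the primary fix.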
CVE-2025-68664: LangChain Serialization Injection (CVSS 9.3)
This vulnerability shows how the LLM integration layer becomes a vector database exfiltration path. langchain-core's serialization layer treated user-controlled dict data containing an lc key as a legitimate serialized LangChain object when it reached the load()/loads() deserialization functions.
With secrets_from_env=True (previously the default), deserialization triggered extraction of all environment variables. This includes vector database connection strings, API keys, and cloud credentials.
The attack chain: prompt inject an lc-keyed structure into LLM output, downstream deserialization extracts secrets, attacker receives vector database credentials. Patched: allowed_objects allowlist parameter added to load()/loads(); secrets_from_env now defaults to False; Jinja2 templates blocked by default.
CVE-2024-41892: Pinecone RBAC Post-Retrieval Check Bypass (CVSS 7.5)
The RBAC check executed after data was already returned rather than before. Impact: cross-organization boundary data exposure. Documented: 200,000+ healthcare embeddings accessible outside their intended tenant scope.
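The fix class here is ordering: authorize before any data leaves the server. A sketch with hypothetical names (not Pinecone's code) showing why a post-retrieval check cannot contain a leak once results have been streamed:

```python
class AuthzError(Exception):
    pass

ACL = {"org-a": {"index-a"}}                 # org -> indexes it may read
STORE = {"index-a": ["vec-1"], "index-b": ["vec-3"]}

def query_vulnerable(org, index, send):
    for vec in STORE[index]:
        send(vec)                            # data streamed to the client...
    if index not in ACL.get(org, set()):
        raise AuthzError(index)              # ...authorization checked too late

def query_fixed(org, index, send):
    if index not in ACL.get(org, set()):
        raise AuthzError(index)              # deny before touching any data
    for vec in STORE[index]:
        send(vec)

leaked = []
try:
    query_vulnerable("org-a", "index-b", leaked.append)
except AuthzError:
    pass
print(leaked)   # the vectors left the server despite the denial

safe = []
try:
    query_fixed("org-a", "index-b", safe.append)
except AuthzError:
    pass
print(safe)     # nothing was sent
```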
Per-Database Hardening: Pinecone, Weaviate, Chroma, Qdrant
Pinecone
- Create one index per tenant. Do not rely on namespace isolation as a security boundary.
- Use project-level RBAC with minimum required permissions. Rotate API keys quarterly.
- Implement application-layer filtering before passing retrieved context to the LLM, but treat this as defense-in-depth, not the primary control.
- Audit key scope: production indexing keys should not have delete permissions.
- Set up audit logging for retrieval patterns. Anomalous query volumes can indicate extraction attempts.
Weaviate
- Upgrade to v1.29.0 or later to access RBAC. Do not use Admin List in production.
- Disable anonymous access explicitly. Weaviate supports API keys and OIDC simultaneously: use OIDC for human access and API keys for service-to-service.
- Enable TLS termination at the Weaviate layer or upstream proxy.
- Apply collection-level permissions aligned with data classification. Not all collections warrant the same access tier.
Chroma
- Chroma has no built-in authentication in its default open-source distribution. For production: run Chroma behind a reverse proxy with authentication (e.g., NGINX with mutual TLS), or use a managed service that adds an auth layer.
- Do not expose Chroma ports (default: 8000) to any network segment accessible from the internet or from untrusted internal services.
- Consider wrapping Chroma calls in an application service layer that enforces access control before queries reach the vector store.
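One way to sketch that service layer in Python. The names are hypothetical and FakeBackend stands in for a real Chroma client (this is not the chromadb API); the point is that the permission check runs before any call reaches the vector store.

```python
class AccessDenied(Exception):
    pass

# Hypothetical per-user collection grants.
PERMISSIONS = {"alice": {"public-docs"}, "bob": {"public-docs", "hr-records"}}

class GuardedVectorStore:
    def __init__(self, backend):
        self.backend = backend

    def query(self, user: str, collection: str, text: str, k: int = 4):
        # Enforce access control BEFORE the query touches the store.
        if collection not in PERMISSIONS.get(user, set()):
            raise AccessDenied(f"{user} may not read {collection}")
        return self.backend.query(collection, text, k)

class FakeBackend:
    # Stand-in for the real client; illustrative only.
    def query(self, collection, text, k):
        return [f"{collection}-hit-{i}" for i in range(k)]

store = GuardedVectorStore(FakeBackend())
print(store.query("bob", "hr-records", "salary bands", k=2))
try:
    store.query("alice", "hr-records", "salary bands")
except AccessDenied as exc:
    print("blocked:", exc)
```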
Qdrant
- Enable API key authentication before any deployment. Qdrant supports both full-access and read-only API keys; use read-only keys in retrieval paths.
- For multi-tenant deployments, use JWT-based access control (v1.9.0+) with collection-scoped tokens.
- Enable TLS. Without it, API keys traverse the network in plaintext.
- Bind to localhost or a private network interface in Docker deployments. The default binds to all interfaces.
Access Control Architecture for Production RAG
The principle of least privilege applies to vector databases the same way it applies to any data store. In practice, we see three common patterns fail:
Single API key for all operations. The same key used to ingest documents during setup is left in production configuration and used for retrieval. If the retrieval service is compromised, the attacker has write access to the knowledge base.
No ingestion source validation. Documents are ingested from unvalidated external sources, RSS feeds, or user uploads without inspection. This is the entry point for poisoning attacks. The FinBot manipulation example (2024) used a fake financial blog article designed to cluster near legitimate advice in embedding space.
No query monitoring. High-frequency retrieval of embeddings across many collections in rapid succession is anomalous behavior. Without monitoring, it is invisible.
The access control architecture we recommend for production RAG includes four elements: separate read and write API keys with distinct rotation schedules, network segmentation that prevents the LLM inference tier from direct write access to the vector store, source validation at ingestion with provenance tracking, and embedding-level anomaly detection on retrieval patterns.
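Embedding-level anomaly detection on retrieval patterns can start as simply as a per-client sliding-window rate check. A stdlib sketch with illustrative thresholds:

```python
import time
from collections import deque

class RetrievalRateMonitor:
    # Flags a client that issues more than `limit` retrievals within
    # `window` seconds -- the bulk-extraction pattern described above.
    def __init__(self, limit=100, window=60.0):
        self.limit = limit
        self.window = window
        self.events = {}  # client_id -> deque of event timestamps

    def record(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        q = self.events.setdefault(client_id, deque())
        q.append(now)
        while q and now - q[0] > self.window:
            q.popleft()           # expire events outside the window
        return len(q) > self.limit  # True => anomalous

monitor = RetrievalRateMonitor(limit=3, window=10.0)
flags = [monitor.record("svc-retrieval", now=t) for t in (0, 1, 2, 3, 4)]
print(flags)
```

In production this would feed an alerting pipeline rather than a boolean, but even this much is enough to surface the high-frequency, cross-collection scan pattern that otherwise stays invisible.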
For a detailed review of this architecture in the context of a full AI stack assessment, see our AI security audit service or the AI security assessment.
15-Point Vector Database Security Checklist
This checklist reflects what we examine when a vector database layer is in scope for an AI security assessment.
Authentication and Access Control
Network and Transport
Ingestion Pipeline
Monitoring and Detection
Versioning and Patching
- secrets_from_env defaults and allowed_objects configuration (CVE-2025-68664) have been reviewed
Including Vector Database Security in Your AI Security Audit
OWASP LLM08:2025 codified vector and embedding weaknesses as a distinct top-tier risk in the 2025 LLM Top 10. The five sub-vulnerability categories under LLM08 are: unauthorized access to embeddings, multi-tenant context leakage, data poisoning, embedding inversion, and data federation conflicts. This taxonomy gives security teams and auditors a shared vocabulary for scoping work and communicating risk to stakeholders.
For teams conducting internal security reviews, the OWASP GenAI Security Project provides a useful starting framework. For assessments requiring external validation, including vector database scope in the engagement brief is now standard.
The key question for any RAG deployment is not "is the LLM secure?" It is whether the full data path, from source document through ingestion, vector store, retrieval, and LLM context, has been assessed against current attack classes.
The attack surface covered here: embedding inversion from stored vectors, knowledge base poisoning via write access, authentication bypass via CVE, and cross-tenant data leakage via RBAC gaps. Each has documented exploits, some with public PoCs and active CVE tracking.
Conclusion
Vector database security is not a hypothetical future concern. CVE-2025-64513 allowed unauthenticated full administrative access to Milvus with one HTTP header. UpGuard found 406 Chroma databases actively exposing production data to anonymous access in April 2025. PoisonedRAG demonstrated 99% attack success rates with 5 injected documents. Vec2Text demonstrated 92% exact text reconstruction from embeddings.
The default security posture of the leading vector databases is open. Organizations deploying RAG in production carry that risk unless they explicitly harden the configuration.
The 15-point checklist above is a starting point. For a complete assessment of your AI stack, including vector database scope, retrieval pipeline controls, and embedding exposure analysis, run a free AI security scan or contact the BeyondScale team.
Technical references: OWASP LLM08:2025: Vector and Embedding Weaknesses | Morris et al., "Text Embeddings Reveal (Almost) As Much As Text," EMNLP 2023 | CVE-2025-64513, NVD | PoisonedRAG, USENIX Security 2025
BeyondScale Team
AI Security Team, BeyondScale Technologies
Security researcher and engineer at BeyondScale Technologies, an ISO 27001 certified AI cybersecurity firm.
Want to know your AI security posture? Run a free Securetom scan in 60 seconds.

