What is the difference between namespace isolation and index-level isolation in Pinecone?

Namespaces in Pinecone are logical partitions within a single index, not security boundaries. A service account with index-level access can query all namespaces in that index. CVE-2024-41892 confirmed this: RBAC checks executed after data retrieval, exposing 200,000+ healthcare embeddings across organization boundaries. True tenant isolation requires one index per tenant with separate scoped API keys.

How does pgvector row-level security work with vector similarity searches?

PostgreSQL's row-level security (RLS) applies to all query types on a table, including vector similarity searches using pgvector's distance operators (<=> for cosine distance, <-> for L2 distance). When RLS is enabled and a policy is defined using a session-level tenant identifier, all vector queries automatically filter to rows matching the policy. The application sets the tenant context before querying, and the database enforces isolation at the engine level.

What is CMEK and when is it required for vector databases?

Customer-managed encryption keys (CMEK) mean your cloud provider's KMS (AWS KMS, Google Cloud KMS, or Azure Key Vault) holds the encryption key, not the vector database vendor. Every read and write operation requires a KMS call to unwrap the encryption key. Disabling or revoking the key immediately renders all stored embeddings cryptographically inaccessible. CMEK is typically required for HIPAA, PCI-DSS, and FedRAMP compliance where demonstrating exclusive key control is mandatory.

Which compliance frameworks apply to vector database audit logging?

SOC 2 Type II requires evidence of access controls and audit trails for all data systems, including vector databases. HIPAA requires audit logs for systems handling protected health information, with a 6-year retention minimum. EU AI Act Article 12 requires high-risk AI systems to maintain automatic event logs throughout their operational lifetime, covering what was retrieved, when, and by which principal. GDPR adds requirements around access logging for systems processing personal data.

Can Weaviate multi-tenancy be used for security isolation?

Yes. Weaviate's multi-tenancy feature isolates each tenant's data into a separate shard at the storage layer, not just a logical partition. Data in one tenant's shard is physically isolated from other shards and not accessible to queries scoped to different tenants. This is stronger than collection-level RBAC alone. However, multi-tenancy must be enabled at collection creation and cannot be added to an existing collection.

How should I detect vector database exfiltration attempts?

Configure your SIEM with query-log-based detection rules targeting: high-frequency retrieval from a single service account across many collections within a short window, bulk result sets (legitimate RAG retrieval returns 3-10 documents; 100+ results per query is anomalous), queries arriving outside normal operating hours from unexpected source IPs, and RBAC configuration changes outside approved change windows. Most managed vector databases can export query logs to cloud-native logging services that integrate with common SIEMs.

Vector Database Hardening: Pinecone, pgvector & Weaviate

Vector database security is the infrastructure layer most RAG deployments skip. Teams invest in prompt filtering, output monitoring, and LLM access controls while the database storing their embedded intellectual property runs with default authentication, no audit logging, and minimal network segmentation. This guide covers specific configuration steps across the four major platforms: Pinecone, pgvector, Weaviate, and Qdrant.

For attack classes and CVEs (CVE-2025-64513 in Milvus, CVE-2024-41892 in Pinecone, PoisonedRAG), see our vector database security risk guide. This guide picks up where that one ends: you understand the threats, here is how to configure your way out of them.

Key Takeaways

Embeddings are invertible. Cornell's Vec2Text research (EMNLP 2023) reconstructed 92% of 32-token input texts exactly from their embeddings alone. Every unauthorized exposure of your vector store is a potential data breach, not just an access control failure.
pgvector supports row-level security at the PostgreSQL layer. This lets you enforce tenant isolation at the database itself rather than in application code, where it is easier to bypass.
Pinecone's six-role RBAC model separates control plane operations (index management) from data plane operations (reads and writes). Most deployments use a single API key for both, which means a compromised retrieval service has write access to the knowledge base.
Weaviate's production-ready RBAC arrived in v1.29.0. Deployments on earlier versions have only two roles: Admin and Read-Only. Many enterprise deployments are still on older versions.
Qdrant ships insecure by default. Its own documentation states: "By default, all self-deployed Qdrant instances are not secure." JWT-based access control, available since v1.9.0, scopes tokens to specific collections.
Customer-managed encryption keys (CMEK) give enterprises cryptographic control: revoking the key in your KMS immediately renders all stored embeddings inaccessible. CMEK is typically required for HIPAA and FedRAMP workloads where exclusive key control must be demonstrated.
EU AI Act Article 12 requires high-risk AI systems to maintain automatic event logs. Vector database query logs are part of this obligation for systems making consequential decisions.
Embedding API keys (OpenAI, Cohere, Voyage AI) frequently end up in version control, CI/CD logs, and application configs. A compromised embedding key gives an attacker the ability to generate and inject vectors into your knowledge base.

Why Vector Databases Require Infrastructure-Level Security

Most enterprise teams apply security controls at the application layer: the LLM API call is authenticated, retrieved context is filtered, and output is monitored. The vector database sits behind this boundary, treated as a trusted internal service. In practice, "trusted internal service" frequently means no authentication, no audit logging, and network isolation limited to VPC placement.

When the application tier is compromised, everything in the vector store is immediately accessible. The stakes are higher than they appear. Your vector database holds embedded representations of every document your RAG pipeline has indexed: contracts, source code, clinical notes, customer support tickets, financial records, internal wikis. These documents look like numeric arrays in storage, but they are not anonymized.

The Vec2Text method from Morris, Kuleshov, Shmatikov, and Rush at Cornell reconstructs 92% of 32-token input texts exactly from their embeddings using only the embedding API. No model weights required. The 2025 follow-on (arXiv:2504.16609) reported cosine similarity of 89.4 and an LLM Judge leakage score of 92.0 against black-box encoders. A zero-shot variant published in April 2025 (arXiv:2504.00147) achieved meaningful inversion without any training on the target model at all.

A second structural risk compounds this. When your ingestion pipeline pulls documents from Confluence, SharePoint, or an internal wiki, it strips the source document's permission metadata. The original access controls exist in the source system. Your vector database almost certainly has no equivalent controls applied at the document level. Anyone who can query the vector store can retrieve context from documents they should not have accessed.

Infrastructure-level security addresses both risks through access control at the database layer rather than only the application, encryption with customer-managed keys, and audit logging that creates a persistent record for compliance and incident response.
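To make the second risk concrete, here is a minimal sketch of carrying source permissions through ingestion and checking them at retrieval time. The Chunk type, the ingest and authorized helpers, and the group names are all hypothetical; a real deployment would store the ACL as metadata in the vector database and filter there rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Chunk:
    """One embedded chunk plus the access metadata copied from its source."""
    doc_id: str
    text: str
    allowed_groups: frozenset = field(default_factory=frozenset)


def ingest(doc_id: str, text: str, source_acl: list) -> Chunk:
    """Keep the source system's ACL instead of discarding it at ingestion."""
    return Chunk(doc_id=doc_id, text=text, allowed_groups=frozenset(source_acl))


def authorized(chunks: list, user_groups: set) -> list:
    """Return only chunks whose ACL intersects the caller's groups.

    A chunk with an empty ACL is visible to nobody (fail closed).
    """
    caller = frozenset(user_groups)
    return [c for c in chunks if c.allowed_groups & caller]


store = [
    ingest("hr-001", "Salary bands for 2025", ["hr"]),
    ingest("eng-042", "Deployment runbook", ["engineering", "sre"]),
    ingest("misc-7", "Document with no ACL recorded", []),
]
visible = [c.doc_id for c in authorized(store, {"engineering"})]
```

Failing closed on an empty ACL is a deliberate choice in this sketch: a document whose permissions were lost during ingestion stays hidden until someone assigns them.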

Internal resource: RAG Security and Data Poisoning Guide covers ingestion-layer attack vectors in detail.

Access Control: Per-Platform Configuration

Pinecone: Separating Control Plane from Data Plane

Pinecone's RBAC model separates access into two planes with three roles each:

Control plane (index and infrastructure management):

Owner: full organizational access including billing and user management
Admin: create and delete indexes, manage API keys, configure project settings
Billing Admin: manage billing settings only

Data plane (read and write to index records):

Data Owner: read, write, and delete vectors within assigned indexes
Data Editor: read and write, no delete permissions
Data Viewer: read-only access to index records

The most common misconfiguration is using a single Owner-level API key for both ingestion (which needs Data Editor or Data Owner on specific indexes) and retrieval (which needs Data Viewer only). If the retrieval service is compromised, the attacker has write access to your knowledge base and can inject poisoned embeddings.

Recommended service account pattern:

Ingestion service   → Data Editor key, scoped to ingestion index only
Retrieval service   → Data Viewer key, read-only on retrieval index
Admin operations    → Owner key in secrets manager, not in application config
Key rotation        → Quarterly via Pinecone API, automated through secrets manager

On namespace isolation: Pinecone namespaces are logical partitions within a single index, not security boundaries. A compromised or overly scoped API key with index-level access reaches all namespaces in that index. For security-sensitive multi-tenant deployments, create one index per tenant with a dedicated scoped key. This costs more but provides actual isolation.
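The index-per-tenant pattern can be sketched as a small resolver that maps a tenant to its own index name and its own scoped key. The naming scheme (rag-<tenant> for indexes, PINECONE_KEY_<TENANT> for environment variables) is an assumption made for illustration, and the sketch deliberately makes no Pinecone SDK calls.

```python
import os
import re

# Hypothetical tenant id format: lowercase letters, digits, and hyphens.
_TENANT_RE = re.compile(r"[a-z0-9-]{1,32}")


def tenant_index_config(tenant_id: str, env=os.environ) -> dict:
    """Resolve the dedicated index name and scoped API key for one tenant."""
    if not _TENANT_RE.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    key_var = "PINECONE_KEY_" + tenant_id.upper().replace("-", "_")
    api_key = env.get(key_var)
    if not api_key:
        # Fail closed: never fall back to a shared or admin key.
        raise KeyError(f"no scoped key configured in {key_var}")
    return {"index_name": f"rag-{tenant_id}", "api_key": api_key}


cfg = tenant_index_config("acme", env={"PINECONE_KEY_ACME": "example-key"})
```

Because each tenant resolves to a different key, a leaked key exposes one tenant's index rather than every namespace in a shared one.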

pgvector: Row-Level Security for Multi-Tenant RAG

pgvector runs inside PostgreSQL, which means you can apply PostgreSQL's native row-level security directly to your embeddings table. RLS policies enforce tenant isolation at the database engine, not in application code, and they apply to all query types including vector similarity searches.

Basic multi-tenant RLS pattern:

-- Step 1: Add tenant identifier to your embeddings table
ALTER TABLE document_embeddings
  ADD COLUMN tenant_id UUID NOT NULL;

-- Step 2: Enable RLS on the table
ALTER TABLE document_embeddings ENABLE ROW LEVEL SECURITY;

-- Step 3: Create a policy that restricts access to the current tenant
CREATE POLICY tenant_isolation ON document_embeddings
  USING (tenant_id = current_setting('app.current_tenant')::uuid);

-- Step 4: Create separate roles for ingestion and retrieval
-- (placeholder passwords; supply real credentials from your secrets manager)
CREATE ROLE ingestion_service LOGIN PASSWORD 'strong-password';
CREATE ROLE retrieval_service LOGIN PASSWORD 'strong-password';

GRANT INSERT, UPDATE ON document_embeddings TO ingestion_service;
GRANT SELECT ON document_embeddings TO retrieval_service;

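-- Step 5: Application sets tenant context before any query.
-- With connection pooling, use SET LOCAL inside a transaction instead, so the
-- setting cannot leak to the next tenant that reuses the connection.
SET app.current_tenant = '550e8400-e29b-41d4-a716-446655440000';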

-- Step 6: Vector similarity searches now automatically filter to tenant rows
SELECT id, content, embedding <=> $1 AS distance
FROM document_embeddings
ORDER BY distance
LIMIT 10;

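The database engine applies the RLS policy before any rows are returned, so the similarity ranking only ever sees the active tenant's rows. A retrieval_service role without the BYPASSRLS attribute cannot access rows outside the currently active tenant, regardless of how the query is constructed. Two caveats: superusers and the table owner bypass RLS by default (run ALTER TABLE document_embeddings FORCE ROW LEVEL SECURITY to apply policies to the owner as well), and the guarantee holds only if end users cannot cause the application to set app.current_tenant to an arbitrary value.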
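On the application side, a minimal sketch of a tenant-scoped search might look like the following. It assumes a DB-API connection that is not in autocommit mode and uses the %s placeholder style of psycopg; the function name tenant_scoped_search is hypothetical. It uses PostgreSQL's set_config function with is_local set to true, because a plain SET statement cannot take bind parameters.

```python
import uuid

SEARCH_SQL = (
    "SELECT id, content, embedding <=> %s::vector AS distance "
    "FROM document_embeddings ORDER BY distance LIMIT %s"
)


def tenant_scoped_search(conn, tenant_id, query_vector, limit=10):
    """Run a similarity search with the tenant set for this transaction only."""
    tenant = str(uuid.UUID(str(tenant_id)))  # raises ValueError if malformed
    vector_literal = "[" + ",".join(str(float(x)) for x in query_vector) + "]"
    cur = conn.cursor()
    try:
        # is_local=true scopes the setting to the current transaction, so a
        # pooled connection cannot carry one tenant's context to the next.
        cur.execute("SELECT set_config('app.current_tenant', %s, true)", (tenant,))
        cur.execute(SEARCH_SQL, (vector_literal, limit))
        rows = cur.fetchall()
        conn.commit()
        return rows
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Validating the tenant identifier as a UUID before it reaches the database is a second line of defense; the bind parameter already prevents injection, but rejecting malformed input early keeps bad values out of logs and error paths.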

Additional configuration for production pgvector deployments:

Enable pgaudit for structured audit logging of all SELECT and DML operations on embedding tables
Enforce TLS by setting ssl = on in postgresql.conf and replacing host entries in pg_hba.conf with hostssl
Set log_min_duration_statement = 1000 to log slow queries and monitor for bulk extraction patterns
Do not grant retrieval roles DELETE permissions or schema-level CREATE rights

Weaviate: RBAC and Shard-Level Multi-Tenant Isolation

Weaviate v1.29.0 introduced fine-grained RBAC at the collection level. Earlier versions provided only two roles: Admin and Read-Only. Check your version before assuming you have granular access control.

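Enable RBAC and define collection-scoped roles. Treat the snippet below as a schematic of the intended key-to-role mapping rather than a literal config file: Weaviate is typically configured through environment variables (for example AUTHENTICATION_APIKEY_ENABLED and AUTHORIZATION_RBAC_ENABLED), and roles are created through the API or a client library, so confirm the exact syntax against the documentation for your version: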

authentication:
  apikey:
    enabled: true
    allowed_keys:
      - roles: ["ingestion-service"]
        key: "${WEAVIATE_INGESTION_KEY}"
      - roles: ["retrieval-service"]
        key: "${WEAVIATE_RETRIEVAL_KEY}"

authorization:
  rbac:
    enabled: true
  roles:
    - name: ingestion-service
      permissions:
        - data: create
          collection: "CustomerDocuments"
    - name: retrieval-service
      permissions:
        - data: read
          collection: "CustomerDocuments"

For stronger isolation in multi-tenant environments, Weaviate's multi-tenancy feature isolates each tenant's data into a separate physical shard. This goes beyond RBAC: data in one shard is not accessible to queries scoped to a different tenant at the storage layer.

Enable multi-tenancy at collection creation (it cannot be added to an existing collection):

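import weaviate
import weaviate.classes as wvc

# Assumes a local instance; use the connect helper that matches your deployment
client = weaviate.connect_to_local()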

client.collections.create(
    "CustomerDocuments",
    multi_tenancy_config=wvc.config.Configure.multi_tenancy(enabled=True)
)

# Add a tenant
collection = client.collections.get("CustomerDocuments")
collection.tenants.create(wvc.tenants.Tenant(name="tenant_acme"))

# Scope queries to a specific tenant
tenant_collection = collection.with_tenant("tenant_acme")
results = tenant_collection.query.near_text(query="contract renewal terms", limit=5)

Each tenant's data occupies a separate shard. Weaviate supports 50,000+ active shards per node and 1M concurrently active tenants across a cluster, according to published architecture documentation. Inactive tenants are automatically offloaded from memory while their data remains persisted. From a security standpoint, a compromised retrieval service can only access the shards assigned to its tenant.

Qdrant: JWT-Based Collection-Scoped Tokens

Qdrant's documentation is explicit: the default configuration binds to all network interfaces with no authentication. Address both before deploying to any non-local environment.

Minimal secure configuration in config.yaml:

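service:
  api_key: "${QDRANT_ADMIN_KEY}"
  read_only_api_key: "${QDRANT_READ_KEY}"
  # TLS stays off unless explicitly enabled; the tls block alone is not enough
  enable_tls: true
  # Needed for the JWT-based access control described below (v1.9.0+)
  jwt_rbac: true

tls:
  cert: /etc/qdrant/certs/server.crt
  key: /etc/qdrant/certs/server.key
  ca_cert: /etc/qdrant/certs/ca.crt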

For multi-tenant deployments, JWT-based access control (v1.9.0+) scopes tokens to specific collections with specific permissions. This is more granular than static API keys:

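import datetime
import os

import jwt  # PyJWT

# Qdrant's documentation describes signing JWTs with the instance API key.
# The environment variable name is an assumption; load it from your secrets manager.
qdrant_jwt_secret = os.environ["QDRANT_ADMIN_KEY"]

# Create a read-only token scoped to a single tenant's collection
payload = {
    # Timezone-aware expiry; datetime.utcnow() is deprecated in recent Python versions
    "exp": datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(hours=24),
    "access": [
        {
            "collection": "tenant_acme_documents",
            "access": "r"
        }
    ]
}

token = jwt.encode(payload, qdrant_jwt_secret, algorithm="HS256")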

A retrieval service for tenant A receives a JWT scoped to tenant_acme_documents with read access. It cannot query tenant B's collection regardless of how the request is constructed. The token scope is enforced server-side at Qdrant.

Encryption: At Rest, In Transit, and Application Layer

Customer-Managed Encryption Keys

Standard at-rest encryption protects data from physical media theft. CMEK extends that model: your cloud provider's KMS holds the encryption keys, not the vector database vendor.

The architecture works as follows. Your KMS key wraps an Encryption Zone Key (EZK) unique to your database cluster. Every read and write operation requires a call to your KMS to unwrap the EZK before proceeding. If you disable or delete the KMS key, your cluster becomes cryptographically inaccessible within seconds. Every key access event appears in your cloud provider's audit trail: AWS CloudTrail, GCP Cloud Audit Logs, or Azure Monitor.
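The key-wrapping flow described above can be modeled in a few lines. This is a toy under loud assumptions: FakeKms and EncryptedStore are hypothetical stand-ins, and the SHA-256 keystream cipher exists only to keep the sketch dependency-free. It is not secure and must not be used in place of AES-GCM or a real KMS client. What it does show is the control flow: every read and write unwraps the EZK through the KMS, and disabling the KMS key blocks access even though the ciphertext is still on disk.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (SHA-256 in counter mode). NOT for real use."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


class FakeKms:
    """Stand-in for a cloud KMS holding the customer-managed key."""

    def __init__(self):
        self._root_key = secrets.token_bytes(32)
        self.enabled = True
        self.audit_log = []  # stands in for CloudTrail or Cloud Audit Logs

    def _check(self, operation: str) -> None:
        self.audit_log.append(operation)
        if not self.enabled:
            raise PermissionError("KMS key is disabled")

    def wrap(self, data_key: bytes) -> bytes:
        self._check("wrap")
        return _keystream_xor(self._root_key, data_key)

    def unwrap(self, wrapped_key: bytes) -> bytes:
        self._check("unwrap")
        return _keystream_xor(self._root_key, wrapped_key)


class EncryptedStore:
    """Stores only the wrapped EZK; plaintext key material is never persisted."""

    def __init__(self, kms: FakeKms):
        self._kms = kms
        self._wrapped_ezk = kms.wrap(secrets.token_bytes(32))
        self._rows = {}

    def put(self, doc_id: str, plaintext: bytes) -> None:
        ezk = self._kms.unwrap(self._wrapped_ezk)
        self._rows[doc_id] = _keystream_xor(ezk, plaintext)

    def get(self, doc_id: str) -> bytes:
        ezk = self._kms.unwrap(self._wrapped_ezk)
        return _keystream_xor(ezk, self._rows[doc_id])
```

Calling get after setting kms.enabled = False raises PermissionError, which is the behavior the revocation guarantee depends on.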

For compliance: organizations handling PHI under HIPAA are commonly expected to demonstrate control over the encryption keys protecting that data. PCI-DSS v4.0 requires key management procedures that ensure only authorized parties hold decryption keys. FedRAMP requires FIPS 140-2 validated cryptographic modules (or its successor, FIPS 140-3) and full key lifecycle management. CMEK helps satisfy all three when implemented with a compliant KMS.

Pinecone offers CMEK on Enterprise plans. Zilliz Cloud (managed Milvus) supports CMEK on dedicated Business-Critical clusters. Weaviate Cloud supports BYOK on enterprise tiers. For self-hosted deployments, apply full-disk encryption at the infrastructure layer using cloud provider-managed keys or customer-managed KMS integration.

TLS Enforcement

All vector database traffic should use TLS 1.2 at minimum, with TLS 1.3 preferred for new deployments. Without TLS, API keys and query content traverse the network in plaintext.

For pgvector, enforce TLS in pg_hba.conf:

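# Allow only TLS connections from application subnets
hostssl  all  all  10.0.0.0/8  scram-sha-256
# Reject everything else. Rules are evaluated first-match, and "host" matches
# both TLS and non-TLS connections, so this also blocks plaintext from 10.0.0.0/8.
host     all  all  0.0.0.0/0   reject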

For Qdrant, TLS is not enabled by default. The tls block in config.yaml must be explicitly configured and the verify_https_client_certificate option set according to your mTLS requirements.

For managed services (Pinecone Cloud, Weaviate Cloud, Qdrant Cloud), TLS is provided by default. Verify by examining the client connection configuration to confirm no fallback to non-TLS connections is possible.
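On the client side, one way to rule out a plaintext or legacy-protocol fallback is to build the TLS context explicitly and pass it to whatever HTTP or database client you use. A minimal sketch with Python's standard ssl module follows; the helper name strict_tls_context is hypothetical, and how the context is handed to a given vector database client depends on that client.

```python
import ssl


def strict_tls_context(ca_file=None) -> ssl.SSLContext:
    """Client context that verifies certificates and refuses TLS below 1.2."""
    # create_default_context enables certificate and hostname verification.
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = strict_tls_context()
```

Pass a ca_file when the database presents a certificate from a private CA, as is common for self-hosted Qdrant or PostgreSQL.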

Embedding API Key Security

Embedding API keys generate the vectors that populate your knowledge base. A compromised embedding key does not just expose data: it gives an attacker the ability to generate and inject arbitrary vectors into your knowledge base, which is the entry point for knowledge base poisoning attacks.

Common exposure patterns: keys hardcoded in application configs that are checked into version control, keys in CI/CD pipeline environment variables that appear in build logs, and single keys shared across development, staging, and production environments.

Controls to implement:

Store embedding API keys in a secrets manager (AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager). Retrieve at runtime, not at build time.
Scope embedding keys to the minimum required permissions. For most providers, a key used only for embedding generation does not need access to fine-tuning, model management, or billing APIs.
Create separate keys for ingestion pipelines and any other services that might call the embedding API for other purposes.
Rotate embedding API keys on the same quarterly schedule as your vector database credentials.
Monitor embedding API usage for volume anomalies. A sudden spike in embedding generation requests from a service account that normally handles retrieval only is a signal worth investigating.
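The first control can be sketched as a loader that reads the key at process start and fails closed when it is missing. The variable name EMBEDDING_API_KEY and both helper names are assumptions; in practice the value would be injected into the environment by your secrets manager rather than written to a config file.

```python
import os


def load_embedding_key(env=os.environ, var="EMBEDDING_API_KEY") -> str:
    """Read the embedding key at runtime; refuse to start without one."""
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; refusing to start without a key")
    return key


def redact(key: str) -> str:
    """Loggable identifier for a key: the last four characters only."""
    return "****" + key[-4:]


key = load_embedding_key(env={"EMBEDDING_API_KEY": " abc12345 "})
```

Logging redact(key) instead of the key itself keeps build and application logs useful for correlating rotations without exposing the secret.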

Audit Logging and Compliance

What to Log

Vector database audit logs should capture four event categories:

Query events: timestamp, service account or user identifier, collection or index queried, query vector hash (not the raw vector), number of results returned, latency.

Ingestion events: document source URL or identifier, tenant assignment, schema version, ingestion service identity, timestamp.

Authentication events: successful and failed authentication attempts, API key identifier (never the key value itself), source IP, timestamp.

Administrative events: collection creation or deletion, RBAC changes, API key creation or rotation, TLS configuration changes.

For privacy-sensitive deployments, log a hash of the query text rather than the raw text itself. This preserves forensic value for pattern analysis while reducing risk from log exposure.
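A query event of the kind described above could be assembled as follows. The field names and the query_audit_record helper are illustrative assumptions, not a schema any of the four platforms emits. The vector is packed as little-endian 32-bit floats and hashed with SHA-256, so the log carries a stable fingerprint rather than the raw vector.

```python
import hashlib
import json
import struct
from datetime import datetime, timezone


def query_audit_record(principal, collection, query_vector, result_count, latency_ms):
    """Build one structured query event without storing the raw vector."""
    packed = struct.pack(f"<{len(query_vector)}f", *query_vector)
    return {
        "event": "query",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "principal": principal,
        "collection": collection,
        "query_vector_sha256": hashlib.sha256(packed).hexdigest(),
        "result_count": result_count,
        "latency_ms": latency_ms,
    }


record = query_audit_record(
    "retrieval-service", "CustomerDocuments", [0.12, -0.5, 0.33], 5, 18.4
)
line = json.dumps(record, sort_keys=True)  # one JSON line per event
```

Identical vectors produce identical hashes, which is enough to spot a replayed or scripted query pattern during an investigation.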

EU AI Act Article 12

The EU AI Act, which entered into force in August 2024, imposes obligations on high-risk AI systems, including those making consequential decisions in employment, credit, healthcare, and safety-critical infrastructure. Article 12 requires that high-risk systems are designed and developed with capabilities for automatic logging of events throughout their operational lifetime.

For RAG systems contributing to high-risk decisions, this means vector database query logs must be:

Retained for the period specified in the system's technical documentation, often 5 to 10 years for regulated sectors
Covering each use of the system: what was retrieved, when, by which principal, and in response to which query
Available to competent national authorities on request

In practice: export query logs from Weaviate, Qdrant, and pgvector to a tamper-evident log store and configure retention policies to match your jurisdiction's requirements. Amazon S3 with Object Lock in compliance mode provides the immutability needed to satisfy audit requirements; logs collected in AWS CloudWatch Logs can be exported to such a bucket for long-term retention.

SIEM Integration for Exfiltration Detection

Vector database query logs provide the signal for detecting active exfiltration. Detection rules to configure in your SIEM:

High result-set volume: legitimate RAG retrieval returns 3 to 10 documents per query. A service account returning 100+ results per query consistently is anomalous.
High query frequency: bulk extraction will appear as a spike in query rate from a single service account or source IP.
Off-hours activity: queries arriving outside normal application operating hours from a production service account warrant investigation.
Cross-collection access: a retrieval service that normally queries one collection querying multiple collections in rapid succession is a signal for lateral movement within the vector store.
RBAC modification events: any change to roles, API key permissions, or network access rules outside an approved change window should generate an alert.
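Three of the rules above can be prototyped over exported log events before being translated into your SIEM's query language. In this sketch the event fields, the detect function, the business-hours window, and the cross-collection threshold are all assumptions; only the 100-result threshold comes from the text. Query-frequency and RBAC-change rules are omitted, and the cross-collection check runs over the whole batch rather than a sliding time window.

```python
from collections import defaultdict
from datetime import datetime

MAX_RESULTS = 100               # 100+ results per query is treated as anomalous
BUSINESS_HOURS = range(7, 20)   # hypothetical window: 07:00 to 19:59
MAX_COLLECTIONS = 3             # hypothetical per-principal limit per batch


def detect(events):
    """Return (principal, rule_name) alerts for a batch of query events."""
    alerts = []
    collections_seen = defaultdict(set)
    for event in events:
        who = event["principal"]
        if event["result_count"] >= MAX_RESULTS:
            alerts.append((who, "bulk_result_set"))
        if datetime.fromisoformat(event["timestamp"]).hour not in BUSINESS_HOURS:
            alerts.append((who, "off_hours"))
        collections_seen[who].add(event["collection"])
    for who, collections in collections_seen.items():
        if len(collections) > MAX_COLLECTIONS:
            alerts.append((who, "cross_collection"))
    return alerts


events = [
    {"principal": "svc-a", "collection": "docs", "result_count": 5,
     "timestamp": "2025-03-01T10:00:00"},
    {"principal": "svc-b", "collection": "docs", "result_count": 500,
     "timestamp": "2025-03-01T03:00:00"},
]
alerts = detect(events)
```

Tune the thresholds against a few weeks of your own traffic first; a retrieval service that reranks may legitimately request more than ten results per query.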

Internal resource: AI Security Posture Management (AISPM) covers monitoring architecture across the full AI stack.

Network Isolation

VPC Placement and Private Endpoints

Vector databases should not be reachable from the public internet. The baseline network posture:

Deploy self-hosted vector databases within a VPC with no public IP assignment
For managed services, use cloud-provider private connectivity options:

- Pinecone: AWS PrivateLink on Enterprise plans
- Weaviate Cloud: AWS PrivateLink support on enterprise tiers
- Qdrant Cloud: private networking on Enterprise plans

Remove or block any load balancer rules that expose vector database ports (Qdrant default: 6333, Weaviate default: 8080, Chroma default: 8000) to internet-facing subnets

A 2025 UpGuard survey found 1,170 internet-accessible Chroma instances, with roughly one-third actively exposing production data to anonymous access. The same pattern appears across Qdrant, Milvus, and Weaviate in publicly reported scans. Default configurations prioritize availability for development; you must apply network isolation explicitly for production.

Segmenting Application Tiers

Network segmentation within the application stack reduces blast radius:

The ingestion service reaches the vector database on write-capable endpoints. It has no access to the LLM inference layer.
The retrieval service reaches the vector database on read-only endpoints. It cannot issue write requests regardless of what the application layer instructs.
The LLM inference layer receives retrieved context from the retrieval service. It has no direct connection to the vector database.

This architecture means that a prompt injection attack that compromises the LLM inference layer cannot directly write poisoned embeddings back into the knowledge base. The attacker must also compromise the ingestion service, which has a separate network path and separate credentials. This adds a meaningful barrier to knowledge base poisoning attacks at the infrastructure layer.

See also: AWS Bedrock Security: Enterprise Configuration Guide for cloud-native AI stack network segmentation patterns.

10-Point Vector Database Hardening Checklist

Apply these controls to any RAG deployment before promoting to production:

Authentication

Enable authentication on all instances. Qdrant: set api_key in config.yaml. Weaviate: enable the API key authentication module. pgvector: replace all trust entries in pg_hba.conf with scram-sha-256.

Create separate service accounts for ingestion (write) and retrieval (read). Do not share credentials between services or environments.

Authorization

Apply the minimum permission scope. Retrieval services need read-only access. Delete permissions belong only to dedicated data lifecycle management services.

For multi-tenant deployments: use database-layer isolation. Weaviate multi-tenancy (shard isolation), pgvector RLS (row isolation), Qdrant JWT collection scoping. Do not treat namespace-level partitioning as a security boundary.

Encryption

Enforce TLS 1.2+ for all connections. Qdrant: configure the tls block in config.yaml. pgvector: use hostssl rules in pg_hba.conf. Verify no plaintext fallback is possible in client libraries.

Enable at-rest encryption. For HIPAA, PCI-DSS, or FedRAMP workloads: require CMEK to maintain exclusive key control.

Store embedding API keys in a secrets manager. Rotate quarterly. Scope to embedding generation only.

Monitoring

Enable query-level audit logging. Export to a tamper-evident store (for example, S3 with Object Lock in compliance mode, fed from CloudWatch Logs or your log shipper) with a retention period matching your compliance obligations.

Configure SIEM detection rules for bulk retrieval anomalies, off-hours access, and RBAC modification events.

Network

Remove all public endpoints. Deploy within a VPC using private endpoints for managed services. Segment ingestion and retrieval network paths with separate security groups or firewall rules.

Conclusion

Your vector database deserves the same security controls you apply to your primary database. The threat model is at least as severe: the data is sensitive, the default configurations are permissive, and the attack surface includes both the database itself and the embedding API that populates it.

The specific steps differ by platform, but the principles are consistent across all four covered here. Authenticate every connection. Scope every permission to the minimum required. Encrypt at rest and in transit, with customer-managed keys where compliance demands it. Log every query. Isolate the database at the network layer before any production traffic reaches it.

For teams building or auditing RAG infrastructure, a BeyondScale AI security assessment validates your vector database configuration against these controls and identifies gaps across the full stack: ingestion pipelines, retrieval architecture, LLM guardrails, and output monitoring. You can also run a Securetom scan to identify exposed vector database endpoints and misconfigured access controls in your current deployment.
