Skip to main content
AI Security

Azure OpenAI Security: Enterprise Deployment Guide

BT

BeyondScale Team

AI Security Team

12 min read

Azure OpenAI security is an infrastructure-level problem for most enterprises, not just a compliance checkbox. Microsoft's Azure OpenAI Service powers AI features across more than 80,000 enterprise customers globally, including roughly 80% of Fortune 500 companies. That scale makes misconfigurations in Azure OpenAI deployments one of the highest-impact attack surfaces in enterprise technology today.

This guide covers the specific vulnerabilities, misconfigurations, and threat actors targeting Azure OpenAI, along with the controls that security teams need to configure before going to production.

Key Takeaways

    • Azure OpenAI ships with API key authentication enabled, public network access on, and content filters at medium thresholds without prompt injection detection active.
    • The Storm-2139 threat group stole Azure OpenAI keys from public repositories, assessed account value within 9 to 17 minutes, and generated 14,000+ unauthorized DALL-E 3 images before Microsoft's lawsuit in January 2025.
    • CVE-2025-53767 is an SSRF vulnerability in Azure OpenAI that can expose Azure Instance Metadata Service tokens, enabling privilege escalation within the Azure tenant.
    • Managed Identity with Cognitive Services-scoped roles eliminates the API key attack surface entirely.
    • Global and DataZone deployment types process prompts outside the customer's selected region, a critical consideration for GDPR compliance.
    • Fine-tuned models create a persistent backdoor risk that differs fundamentally from inference-time attacks: a poisoned model persists until explicitly deleted.

The Real Threat Landscape: What Attackers Are Doing

Security teams planning Azure OpenAI deployments often focus on hypothetical prompt injection scenarios. The documented threat activity tells a different story.

Storm-2139 and LLMjacking

In January 2025, Microsoft's Digital Crimes Unit filed a federal lawsuit naming seven individuals across Iran, the UK, Hong Kong, and Vietnam for a coordinated Azure OpenAI attack campaign. The operation, attributed to Storm-2139, scraped Azure API keys from public GitHub repositories and phishing campaigns, then resold access to the stolen accounts.

The technical execution was precise. The group built a tool called de3u, a DALL-E 3 frontend with a built-in reverse proxy routed through Cloudflare to avoid origin detection. A second tool, OAI Reverse Proxy, tunneled traffic through a domain they controlled. Once they had a valid 52-character Azure API key, attackers ran GetCostAndUsage API calls within 9 to 17 minutes to assess account value before exposure. Victims faced both unexpected billing charges and content policy violations attached to their accounts for content they never generated.

The mitigation is complete: disable local authentication (disableLocalAuth: true in the ARM or Bicep template) and switch to Microsoft Entra ID with Managed Identity. API keys cannot be stolen if they are never issued.

CVE-2025-53767: SSRF to Privilege Escalation

Azure OpenAI Services has a documented Server-Side Request Forgery vulnerability, catalogued as CVE-2025-53767 (CWE-918). The flaw results from insufficient validation of user-supplied input used to construct server-side requests. An attacker who can trigger the SSRF can direct requests to the Azure Instance Metadata Service (IMDS), the internal endpoint at 169.254.169.254 that returns bearer tokens for the managed identity attached to the Azure resource.

Obtaining those tokens allows the attacker to make authenticated Azure Resource Manager API calls with the permissions of the Azure OpenAI managed identity. If that identity was improperly assigned an Owner or Contributor role at subscription scope (a common misconfiguration), the result is full tenant takeover. Patch status and full CVSS score were not publicly disclosed at time of writing. The immediate defensive action is network-layer restriction: configure the Azure OpenAI private endpoint and disable public network access so the resource cannot reach arbitrary endpoints.

Content Filter Bypass

In October 2024, Mindgard disclosed two vulnerabilities in Microsoft's Azure AI Content Safety Service that allowed attackers to bypass content filtering entirely before the protected model received the request. Microsoft reduced the impact through partial patching. The disclosure confirmed what security researchers have observed more broadly: content filters are probabilistic controls, not deterministic blockers. Treating Azure OpenAI content filters as a security boundary rather than a risk reduction layer is the category error that leads to compliance failures.

Authentication and Identity Configuration

Azure OpenAI supports four authentication methods. Only two should appear in production enterprise deployments.

API Keys (Subscription Keys): 52-character alphanumeric tokens that grant full access to the Azure OpenAI resource. They require no user context, cannot be scoped to specific operations, and are the primary credential type stolen in LLMjacking campaigns. Disable them.

Managed Identity (System-Assigned): The correct production option for Azure-hosted workloads. The managed identity is tied to the resource lifecycle, requires no credential storage, and can be scoped to granular Cognitive Services roles. System-assigned identities are preferred over user-assigned because they cannot be reattached to a different resource.

Microsoft Entra ID with Service Principal: Appropriate when the calling application runs outside Azure, such as an on-premises system or a CI/CD pipeline. Requires careful secret rotation and should use certificate authentication rather than client secrets where possible.

Conditional Access Policies: Entra ID P1 or P2 licenses allow Conditional Access to restrict which identities, device states, and network locations can call Azure OpenAI. This adds a defense-in-depth layer even after credentials are confirmed valid.

The RBAC assignment matters as much as the authentication method. Assign only what each identity needs:

| Role | Use Case | |---|---| | Cognitive Services OpenAI User | Inference: calling the model API | | Cognitive Services OpenAI Contributor | Model management: deploying and fine-tuning | | Cognitive Services Contributor | Resource administration: creating deployments |

Never assign Owner or Contributor at subscription scope to a managed identity used by Azure OpenAI. See Microsoft's official RBAC guidance for Azure OpenAI for the complete role matrix.

Network Security Architecture

Default Azure OpenAI deployments accept traffic from any IP address over the public internet. A production configuration should look nothing like this.

Private Endpoint Configuration

Deploy Azure OpenAI with a private endpoint in your Virtual Network:

  • Set publicNetworkAccess: Disabled on the resource.
  • Create a Private Endpoint in the target VNet.
  • Configure a Private DNS Zone (privatelink.openai.azure.com) to resolve the resource hostname to the private IP.
  • For on-premises access, route through Azure VPN Gateway or ExpressRoute, not public internet.
  • This configuration removes the CVE-2025-53767 SSRF attack surface at the network layer by ensuring the resource cannot reach arbitrary internet endpoints.

    API Gateway with Azure API Management

    Placing Azure API Management (APIM) between callers and Azure OpenAI adds several security controls that are not available at the resource level:

    • Token rate limiting: The azure-openai-token-limit policy enforces per-key token consumption limits. Exceeding the rate limit returns HTTP 429; exceeding a quota returns HTTP 403.
    • Schema validation: Reject malformed requests before they reach the model.
    • Request logging: Capture request metadata in Azure Monitor without logging prompt content.
    One architecture-specific caveat: in multi-region APIM deployments, each regional gateway maintains a separate token counter. Token limits are not globally enforced across regions. An attacker who routes requests through multiple regional gateways can exceed limits intended to prevent abuse. Design token budgets with this gap in mind, or use a centralized gateway topology.

    Content Filtering and Prompt Security

    Azure AI Content Safety provides four harm category filters (violence, hate, sexual, self-harm) at four severity levels. The default deployment threshold is Medium, which blocks medium and high severity content for both inputs and outputs.

    Default filters do not include prompt injection detection. Enabling prompt injection detection requires configuring Prompt Shields, a separate feature that provides two detection modes:

    • Direct attack detection: Identifies instructions in user messages that attempt to override system prompts (jailbreaking).
    • Indirect attack detection (XPIA): Identifies instructions embedded in documents, emails, or retrieved content that attempt to hijack the model when processed by a RAG pipeline.
    For deployments using Azure OpenAI "On Your Data" with Azure AI Search, Blob Storage, or Cosmos DB, indirect injection detection is not optional. Every document in the retrieval corpus is a potential injection vector. The Azure AI Content Safety documentation on Prompt Shields covers the configuration steps.

    Microsoft announced Spotlighting at Build 2025, a technique that applies encoding (including base-64) to retrieved document content to signal lower trust to the model. This helps the model differentiate between trusted system prompt content and untrusted retrieved content, reducing the effectiveness of indirect injection attacks without blocking retrieval entirely.

    For internal context on how prompt injection defenses layer together, see our indirect prompt injection defense guide and the BeyondScale AI security assessment for organizations running Azure OpenAI in regulated environments.

    Data Residency and Privacy Configuration

    Three deployment types exist for Azure OpenAI, and the differences matter for GDPR and data sovereignty compliance.

    Regional: Prompts are processed only within the customer's selected Azure region. The correct choice for any data with legal residency requirements.

    Global: Prompts may be processed in any geography where the model is available to improve throughput and availability. Data at rest stays in the designated geography; processing location is variable.

    DataZone: Prompts may be processed anywhere within a named data zone (example: the EU data zone covering all EU member nations). Provides geographic scope but not country-level residency guarantees.

    Enterprises subject to GDPR Article 46 transfer requirements should use Regional deployments only and document the specific Azure region in their Records of Processing Activities (RoPA).

    On abuse monitoring: by default, Microsoft may store samples of flagged prompts for human review by authorized employees accessing data through Secure Access Workstations via Just-In-Time access. Enterprise Agreement and Microsoft Customer Agreement customers can apply to disable content logging. This opt-out has a compliance tradeoff: opting out reduces exposure to insider review of sensitive content, but the organization assumes sole legal responsibility for any abuse that would otherwise have been detected.

    Verify opt-out status programmatically:

    az cognitiveservices account show -n <account-name> -g <resource-group> \
      --query "properties.capabilities[?name=='ContentLogging']"

    A result of "value": "false" confirms content logging is disabled.

    Microsoft's data handling commitments are documented in the official data privacy page for Azure OpenAI.

    Fine-Tuning Pipeline Security

    Fine-tuning is the most underprotected capability in Azure OpenAI deployments. Unlike inference-time attacks, a security compromise in the fine-tuning pipeline creates a persistent threat: a backdoored model that behaves normally on most inputs but produces attacker-controlled outputs when it encounters a specific trigger phrase.

    Three categories of risk apply:

    Training data poisoning: An attacker with write access to the fine-tuning data pipeline can embed backdoors. Azure AI Foundry runs a safety evaluation after fine-tuning completes, but the evaluation only reports pass/fail, not the specific content that was evaluated. Treating this safety check as equivalent to a security audit overstates its scope.

    Membership inference: Adversaries with black-box access to a fine-tuned model can make statistical inferences about whether specific data records were included in training. This is a known risk when fine-tuning on datasets containing PII or proprietary business data.

    Data relocation during fine-tuning: Microsoft disclosed in March 2025 that fine-tuning operations may involve temporary data relocation outside the customer's selected geography. This disclosure was in the Terms of Service update and received limited coverage. Enterprises with strict data residency requirements for training data need to verify the current policy before initiating fine-tuning jobs on sensitive datasets.

    Mitigations: implement write-access controls on fine-tuning data storage (Azure Blob Storage with strict RBAC), log all fine-tuning job initiations to Azure Monitor, and verify model behavior against a fixed evaluation set before promoting any fine-tuned model to production.

    Monitoring and Threat Detection

    A complete Azure OpenAI monitoring stack includes three layers:

    Azure Monitor Diagnostic Logs: Captures API request metadata, token usage, content filter decisions, and error rates. Does not log prompt or completion content by default. Configure log retention and export to a Log Analytics Workspace.

    Microsoft Defender for Cloud with AI Security Posture Management (AI-SPM): Discovers Azure OpenAI deployments across the tenant, generates an AI bill of materials, identifies misconfigurations (public access enabled, missing private endpoints, over-privileged RBAC), and provides attack path analysis.

    Microsoft Defender for AI Services (GA: May 2025): Runtime threat detection aligned with the OWASP LLM Top 10. Detects direct and indirect prompt injections, ASCII smuggling attacks (where Unicode encoding obscures malicious instructions), malicious URL generation in completions, suspicious API access patterns, and wallet abuse through token consumption anomalies.

    Configure all three layers. AI-SPM finds the configuration problems that exist before an attack occurs. Defender for AI Services finds the attacks that succeed despite correct configuration. Azure Monitor provides the audit trail for incident investigation.

    Security Baseline Checklist

    Before any Azure OpenAI deployment reaches production, verify:

    • [ ] disableLocalAuth: true set on the resource (API keys disabled)
    • [ ] Managed Identity configured with Cognitive Services-scoped RBAC only
    • [ ] Private endpoint deployed; publicNetworkAccess: Disabled
    • [ ] Regional deployment type selected (not Global or DataZone)
    • [ ] Prompt Shields enabled in content filter configuration
    • [ ] Indirect injection detection enabled for RAG/On Your Data deployments
    • [ ] Azure API Management deployed with token rate limits configured
    • [ ] Diagnostic logs configured and exported to Log Analytics Workspace
    • [ ] Microsoft Defender for Cloud with AI-SPM enabled on subscription
    • [ ] Content logging opt-out applied if required by data classification policy
    • [ ] Fine-tuning data stored in Blob Storage with strict RBAC and write-access logging
    • [ ] Customer-Managed Key configured for data at rest if required by policy

    Conclusion

    Azure OpenAI security requires deliberate configuration at every layer. The default deployment state exposes API keys, public network access, and minimal injection detection. The documented incident history, from Storm-2139's LLMjacking campaign to the CVE-2025-53767 SSRF chain, shows that attackers are actively exploiting Azure OpenAI misconfigurations at scale.

    The security controls that matter most: disable API key authentication, deploy private endpoints, restrict RBAC to Cognitive Services roles, enable Prompt Shields, and use Regional deployments for data with residency requirements. These are not expensive controls. They are configuration changes that eliminate entire attack categories.

    If your organization is deploying Azure OpenAI and needs an independent assessment of your current configuration, BeyondScale provides AI security assessments specifically designed for enterprise LLM deployments, covering authentication, network architecture, content filtering, and fine-tuning pipeline security.

    Share this article:
    AI Security
    BT

    BeyondScale Team

    AI Security Team, BeyondScale Technologies

    Security researcher and engineer at BeyondScale Technologies, an ISO 27001 certified AI cybersecurity firm.

    Want to know your AI security posture? Run a free Securetom scan in 60 seconds.

    Start Free Scan

    Ready to Secure Your AI Systems?

    Get a comprehensive security assessment of your AI infrastructure.

    Book a Meeting