LLMjacking attacks cost victim organizations up to $100,000 per day in stolen AI compute, and the frequency is accelerating. Since the Sysdig Threat Research Team coined the term in May 2024, what began as opportunistic credential theft has evolved into a structured criminal industry complete with commercial marketplaces, automated scanning infrastructure, and monetization pipelines targeting OpenAI, Anthropic, AWS Bedrock, and Azure OpenAI accounts.
In this guide, you will learn precisely how LLMjacking attacks are staged, what attackers do once they have access, and which controls at the secrets management, identity, and monitoring layers reliably stop them.
Key Takeaways
- Sysdig's original research documented $46,080 per day in costs for Claude 2.x attacks; Claude 3 Opus targets push that figure above $100,000 per day.
- Over 39 million secrets were exposed on GitHub in 2024. Attackers detect newly committed keys in under four minutes.
- The OAI Reverse Proxy (oai-reverse-proxy) is the central technical enabler: attackers resell stolen LLM access without exposing credentials to buyers.
- Operation Bizarre Bazaar (December 2025 to January 2026) captured 35,000 attack sessions and documented a three-stage criminal supply chain: scanner, validator, marketplace.
- AWS CloudTrail logs for DeleteModelInvocationLoggingConfiguration calls are a near-certain indicator of active attacker presence.
- Short-lived credentials, per-model IAM restrictions, and protected invocation logging are the highest-return preventive controls.
- OWASP LLM10:2025 Unbounded Consumption formalizes this threat class with mitigations at four pipeline stages.
What LLMjacking Is and Why It Exploded
The Sysdig Threat Research Team published the first documented case in May 2024. Attackers had exploited CVE-2021-3129, a CVSS 9.8 remote code execution vulnerability in Laravel's Ignition debug mode, to compromise a server, steal AWS credentials, and invoke AI services across 10 different cloud providers in a single campaign.
The economics are straightforward. Attackers acquire stolen LLM API credentials for as little as $30 per account on underground forums, then resell access at 40 to 60% below legitimate pricing. Buyers pay once; the victim organization receives the compute bill. A single Claude 3 Opus ORP instance running over 4.5 days in one Sysdig-documented case consumed 2.2 billion tokens, approximately $50,000 in input and output costs combined.
The threat has since matured. By late 2025, Pillar Security's research on Operation Bizarre Bazaar documented a fully professionalized three-stage criminal supply chain: distributed scanning bots that probe for exposed endpoints using Shodan and Censys; automated validation scripts that test discovered credentials against AI21 Labs, Anthropic, AWS Bedrock, Azure, ElevenLabs, Mistral, OpenAI, and GCP Vertex AI; and a commercial marketplace called silver.inc that resells access to 30+ providers via bulletproof hosting in the Netherlands.
The threat is not slowing. In March 2026, a developer reported an $82,000 Gemini API bill generated in 48 hours from a single stolen key.
How Attackers Steal AI Credentials
Repository and paste site scanning. The most common initial access vector is automated scraping of public GitHub repositories, Pastebin, Reddit threads, and Common Crawl archives. GitHub disclosed that over 39 million secrets were leaked across its platform in 2024 alone. GitGuardian documented a 1,212x increase in OpenAI API key leaks compared to 2022. The median time from a secret commit to external actor detection is under four minutes. Only 2.6% of secrets are revoked within one hour of detection, leaving a wide exploitation window.
Application vulnerabilities. The original Sysdig case used CVE-2021-3129. Similar chains exploit other web application vulnerabilities to gain server access, read environment variables or .env files, and exfiltrate API keys stored in plaintext.
CI/CD pipeline exposure. Build logs, artifact stores, and environment variable injection points in GitHub Actions, GitLab CI, and Jenkins frequently contain API keys that developers set for testing. These persist in logs well after the key has been rotated in the application environment.
Misconfigured self-hosted endpoints. Ollama instances (default port 11434) and OpenAI-compatible API servers (port 8000) exposed without authentication are direct targets. Pillar Security's honeypots captured nearly 1,000 attack sessions per day during Operation Bizarre Bazaar, the majority probing for these endpoints.
Once credentials are acquired, attackers run custom Python validation scripts (_OAIdragoChecker.py, _AWSdragoChecker.py, _AZUREdragoChecker.py) that enumerate accessible models, assess account spending limits, and categorize credentials by value before deciding whether to use them directly or sell them. The fastest observed credential-to-first-exploitation timeline in published research is nine minutes.
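The triage logic such validation scripts implement can be sketched in a few lines. Everything below is illustrative: the function name, model list, and spend thresholds are assumptions for the sketch, not the actual contents of the _*dragoChecker.py tools.

```python
# Hypothetical sketch of attacker credential triage: given the models a
# stolen key can invoke and the account's spending headroom, assign a
# resale tier. Model names and thresholds are illustrative assumptions.
HIGH_VALUE_MODELS = {
    "anthropic.claude-3-opus-20240229",
    "gpt-4o",
}

def triage_credential(accessible_models: set, monthly_limit_usd) -> str:
    """Categorize a validated credential by resale value."""
    has_premium = bool(accessible_models & HIGH_VALUE_MODELS)
    uncapped = monthly_limit_usd is None or monthly_limit_usd >= 10_000
    if has_premium and uncapped:
        return "tier-1"  # premium models, effectively unlimited spend
    if has_premium or uncapped:
        return "tier-2"
    if accessible_models:
        return "tier-3"  # cheap models or tight spending caps
    return "dead"        # key validates but can invoke nothing

print(triage_credential({"gpt-4o"}, None))  # tier-1
```

The defensive takeaway from this shape is that hard spending caps and narrow model access directly reduce a stolen key's resale value.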
The Reverse Proxy Monetization Infrastructure
The technical component that transformed LLMjacking from theft into a scalable business is the open-source oai-reverse-proxy tool. It acts as a gateway that accepts API calls from buyers, forwards them through stolen credentials to legitimate AI providers, and returns responses, without ever exposing the underlying key to the buyer.
This architecture solves the attacker's trust problem: a buyer pays for access and cannot extract or resell the credential they are indirectly using.
The ORP also performs active detection evasion. By default, it checks whether AWS Bedrock model invocation logging is enabled for a compromised account and skips the account if logging is active. Attackers additionally call DeleteModelInvocationLoggingConfiguration directly to disable CloudWatch and S3 logging, erasing the evidence trail before exploitation begins.
ORP servers are exposed to buyers through TryCloudflare tunnels using auto-generated, rotating domain names (for example, examined-back-breakdown-diabetes.trycloudflare[.]com). These rotate with each tunnel session, making blocklist-based defenses ineffective without behavioral controls.
Distribution happens through Discord, Telegram, 4chan, and Rentry.co. CSS obfuscation on some proxies requires buyers to disable stylesheets to access the interface, providing minimal friction to buyers while adding one layer of obscurity from automated crawlers.
What Attackers Do With Stolen Access
In one Sysdig analysis of monitored prompts, 95% were adult-oriented roleplay content, reflecting demand from users bypassing content filters. Other documented uses include OCR and image analysis, generating or improving exploitation scripts (attackers used Claude 3 in one documented case to write new credential-checking tools), and access by users in sanctioned countries who cannot create legitimate accounts.
A significant shift observed in Operation Bizarre Bazaar is worth noting: by late January 2026, 60% of attack traffic had shifted from compute theft toward MCP (Model Context Protocol) reconnaissance, probing file systems, databases, shell access, API integrations, and Kubernetes clusters. LLMjacking is increasingly a staging ground for deeper compromise, not just a billing attack. See the MCP server security guide for the specific controls that apply to that exposure surface.
The denial-of-wallet dimension is significant even when data exfiltration is not the goal. OWASP LLM10:2025 documents a case in which a startup received 100,000+ requests in 48 hours that generated a $200,000 bill, forcing shutdown. Even a brief exploitation window before detection can be financially catastrophic.
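One of the OWASP LLM10:2025 mitigations for unbounded consumption is a hard per-client spend cap enforced before a request ever reaches the provider. Here is a minimal sketch of that idea; the window size, budget, and cost-estimation inputs are illustrative assumptions, not a production rate limiter.

```python
# Minimal sketch of a per-client spend cap enforced before calling the
# LLM provider. Budget and window values are illustrative assumptions.
import time
from collections import defaultdict

class SpendGuard:
    def __init__(self, budget_usd: float, window_s: int = 3600):
        self.budget = budget_usd
        self.window = window_s
        self.spend = defaultdict(list)  # client_id -> [(timestamp, usd)]

    def allow(self, client_id: str, est_cost_usd: float, now=None) -> bool:
        now = time.time() if now is None else now
        cutoff = now - self.window
        recent = [(t, c) for t, c in self.spend[client_id] if t > cutoff]
        self.spend[client_id] = recent
        if sum(c for _, c in recent) + est_cost_usd > self.budget:
            return False  # reject before the provider bills you
        recent.append((now, est_cost_usd))
        return True

guard = SpendGuard(budget_usd=5.0)
print(guard.allow("app-1", 2.0))  # True
print(guard.allow("app-1", 2.0))  # True
print(guard.allow("app-1", 2.0))  # False: would exceed the $5/hour cap
```

A cap like this converts a potential $200,000 incident into a bounded, alertable failure.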
Detection: What to Monitor
AWS CloudTrail signatures. A DeleteModelInvocationLoggingConfiguration event is a near-certain indicator of active attacker presence. Attackers also call GetFoundationModelAvailability, ListFoundationModels, and GetCostAndUsage during the reconnaissance phase; monitoring for these calls from unexpected IAM principals surfaces intrusions early.
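A first-pass triage over CloudTrail records for these event names can be sketched as follows. The record shape follows the CloudTrail JSON schema, but the severity labels and the allowlist of expected principals are assumptions you would replace with your own.

```python
# Sketch of a CloudTrail triage pass for the reconnaissance and
# log-tampering calls named above. Severity labels and the expected
# principal allowlist are illustrative assumptions.
INDICATOR_EVENTS = {
    "DeleteModelInvocationLoggingConfiguration": "critical",
    "GetFoundationModelAvailability": "recon",
    "ListFoundationModels": "recon",
    "GetCostAndUsage": "recon",
}

def triage_events(events: list, expected_principals: set) -> list:
    alerts = []
    for ev in events:
        severity = INDICATOR_EVENTS.get(ev.get("eventName", ""))
        principal = ev.get("userIdentity", {}).get("arn", "unknown")
        if severity and principal not in expected_principals:
            alerts.append({"event": ev["eventName"],
                           "principal": principal,
                           "severity": severity})
    return alerts

sample = [{"eventName": "DeleteModelInvocationLoggingConfiguration",
           "userIdentity": {"arn": "arn:aws:iam::111122223333:user/unknown-user"}}]
print(triage_events(sample, {"arn:aws:iam::111122223333:role/app-role"}))
```

In practice you would run this over events pulled via the CloudTrail LookupEvents API or wire the same match into an EventBridge rule for real-time alerting.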
Invocation log anomalies. Enable AWS Bedrock model invocation logging and protect the configuration with a deny policy on DeleteModelInvocationLoggingConfiguration. Review logs for: unexpected model IDs (high-tier models never used in production), unusual prompt patterns, high-volume identical requests, and max_tokens_to_sample: -1 parameter values, a behavioral signature observed in documented attacks.
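The log checks listed above reduce to a small matching function. This is a sketch: the entry shape is a simplified assumption loosely modeled on Bedrock invocation logs, not the exact schema.

```python
# Sketch of the invocation-log checks described above: flag calls to
# models outside a production allowlist and the max_tokens_to_sample: -1
# signature. The entry field names are simplified assumptions.
def flag_invocation(entry: dict, allowed_models: set) -> list:
    findings = []
    model = entry.get("modelId", "")
    if model not in allowed_models:
        findings.append(f"unexpected model: {model}")
    if entry.get("body", {}).get("max_tokens_to_sample") == -1:
        findings.append("max_tokens_to_sample: -1 (known attack signature)")
    return findings

entry = {"modelId": "anthropic.claude-3-opus-20240229",
         "body": {"max_tokens_to_sample": -1}}
print(flag_invocation(entry, allowed_models={"anthropic.claude-3-haiku-20240307"}))
```

Either finding alone warrants investigation; both together, on a high-tier model your application never uses, is the documented attack pattern.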
Billing and usage alerts. Configure AWS Cost Explorer and Azure Cost Management to alert on anomalous spending. A sudden spike in AI inference costs, particularly in regions where your application does not operate, is a reliable early signal.
Behavioral analytics. Tools like Falco and Sysdig Secure provide runtime behavioral signatures for cloud credential abuse. Alerting on multi-provider enumeration attempts in rapid succession catches the validation phase before exploitation begins.
The AI incident response playbook covers the full response workflow once an alert fires, including how to scope the blast radius and whether to pursue cost dispute with the cloud provider.
Prevention Architecture
Short-lived credentials over static API keys. HashiCorp Vault's AWS secrets engine issues dynamic, time-limited IAM credentials that auto-revoke when the Vault lease expires. AWS IAM roles with temporary STS credentials are preferable to static access keys for all AI service calls. There is no valid reason for a production application to use a long-lived OpenAI API key stored as an environment variable.
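The pattern reduces to: every credential carries an expiry, callers check validity before use, and renewal forces a round trip to the issuer (a Vault lease or STS AssumeRole in practice). A minimal sketch of that shape, with the issuer stubbed out and all field names illustrative:

```python
# Minimal sketch of the short-lived credential pattern. The issuer here
# is a stub standing in for Vault's AWS secrets engine or STS AssumeRole;
# field names and the 900-second TTL are illustrative assumptions.
import time
from dataclasses import dataclass

@dataclass
class LeasedCredential:
    access_key: str
    secret_key: str
    expires_at: float  # epoch seconds, set from the lease/STS Expiration

    def is_valid(self, now=None) -> bool:
        return (time.time() if now is None else now) < self.expires_at

def issue_credential(ttl_s: int = 900) -> LeasedCredential:
    # Stub for vault.read("aws/creds/<role>") or sts.assume_role(...)
    return LeasedCredential("AKIA-EXAMPLE", "example-secret", time.time() + ttl_s)

cred = issue_credential(ttl_s=900)
print(cred.is_valid())                       # True immediately after issue
print(cred.is_valid(now=time.time() + 901))  # False once the lease expires
```

A stolen credential with a 15-minute lease is worth almost nothing on a marketplace; a static key is worth $30 and a five-figure bill.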
Secrets management. AWS Secrets Manager encrypts secrets at rest, enforces access through IAM policies, and supports automatic rotation. Any secret stored in a .env file, committed to a repository, or printed in a CI/CD log is a liability.
Repository scanning. GitHub's built-in secret scanning with push protection blocks commits that match known secret patterns before they reach the repository. GitGuardian provides broader monitoring across repositories, collaboration tools, and artifact stores. Given that attackers detect new secrets in under four minutes, pre-commit blocking is the only reliable control at this layer.
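A pre-commit check of the kind push protection runs server-side can be sketched with pattern matching over the staged diff. The two regexes below are simplified approximations of real OpenAI and AWS key formats, not the full ruleset production scanners use.

```python
# Sketch of a pre-commit secret check. The patterns are simplified
# approximations of OpenAI and AWS key formats, for illustration only.
import re

SECRET_PATTERNS = {
    "openai_api_key": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
}

def scan_diff(diff_text: str) -> list:
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        if pattern.search(diff_text):
            hits.append(name)
    return hits  # non-empty result -> block the commit

fake_key = "sk-" + "a" * 24  # synthetic example, not a real key
print(scan_diff(f'OPENAI_API_KEY = "{fake_key}"'))  # ['openai_api_key']
```

Wired into a pre-commit hook, a non-empty result exits nonzero and the commit never happens, which is the point: given the four-minute detection window, a blocked commit is the only safe outcome.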
Least privilege for AI service accounts. Restrict IAM roles to specific model ARNs and specific AWS regions only. Use separate service accounts per application: a key compromised from one service should not provide access to AI services used by another. Explicitly deny DeleteModelInvocationLoggingConfiguration in your IAM policy to protect your audit trail.
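The policy shape described above looks like the following. The account, region, and model ARN are placeholders; adapt the allowlist to the models your application actually invokes.

```python
# Sketch of a least-privilege Bedrock policy: allow InvokeModel on one
# model ARN in one region, and explicitly deny the log-tampering call.
# The model ARN and Sid values are placeholder assumptions.
import json

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeOnlyApprovedModel",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
        {
            "Sid": "ProtectInvocationLogging",
            "Effect": "Deny",
            "Action": "bedrock:DeleteModelInvocationLoggingConfiguration",
            "Resource": "*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```

Because the explicit Deny overrides any Allow granted elsewhere, even a compromised principal with broad permissions cannot erase the invocation audit trail.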
Network controls. Never expose Ollama instances, self-hosted LLM endpoints, or MCP servers directly to the internet. Restrict InvokeModel API calls to known application IP ranges. Block AS135377 subnets and the 204.76.203.0/24 range associated with the silver.inc operation per Pillar Security's indicators of compromise.
This connects to the broader principle of non-human identity security for AI agents: every AI API key is a non-human identity with its own access scope, rotation schedule, and audit requirements.
Incident Response
When you confirm an LLMjacking incident, the first priority is credential rotation. Revoke not just the identified key but all keys in the environment. Assume lateral movement to any resource the compromised credential could access.
Pull CloudTrail logs for the affected service account. Look for DeleteModelInvocationLoggingConfiguration events indicating log clearing. If invocation logs were not deleted, review them for what models were called, what prompts were submitted, and what data was returned. Prompt logs can contain sensitive application data depending on how your application constructs API calls.
Identify and patch the initial access vector. If credentials were in a public repository, treat them as compromised from the moment of the first commit, not from when you noticed the billing spike.
Notify your cloud provider through their abuse channel. AWS, Azure, and GCP have documented processes for reviewing charges resulting from credential theft and may waive costs with prompt reporting and evidence of a security incident.
Audit all IAM principals created during the compromise window. Attackers frequently create secondary access mechanisms (new IAM users, access keys, or service accounts) to maintain persistence after the initial credential is rotated.
Conclusion
LLMjacking has moved from opportunistic credential theft to organized criminal infrastructure in under two years. Operation Bizarre Bazaar documented the shift: a commercial marketplace, distributed scanning bots, automated validation pipelines, and a threat actor monetizing stolen AI access at scale across 30+ providers.
The good news is that the attack chain has clear, addressable weak points. Eliminating static long-lived API keys from repositories and CI/CD pipelines closes the most common initial access vector. Protecting and monitoring model invocation logs catches active exploitation before costs become catastrophic. Restricting IAM permissions to specific model ARNs and regions limits blast radius when credentials are inevitably exposed.
If you want to know whether your organization has exposed AI API keys in git history, build artifacts, or misconfigured endpoints, book a BeyondScale AI security assessment. Our audits include API key exposure scanning and LLMjacking risk assessment as standard components.
You can also run a free scan to get an initial read on your AI attack surface.
Sources: Sysdig LLMjacking original report (May 2024); Pillar Security Operation Bizarre Bazaar (2026); OWASP LLM10:2025 Unbounded Consumption; NIST AI RMF.