AI Security

AI Agent Authorization Security: Least Privilege Before Agents Get Root

BeyondScale Team

AI Security Team

14 min read

AI agent authorization security is now one of the most actively exploited gaps in enterprise deployments. In July 2025, researchers at Noma Security disclosed ForcedLeak, a CVSS 9.4 vulnerability in Salesforce AgentForce where hidden instructions inside a Web-to-Lead form caused a customer service agent to exfiltrate CRM data, sales pipeline details, and integration credentials, all using permissions the agent legitimately held. No exploit code. No policy violation. Just an agent doing exactly what it was authorized to do, pointed in a direction no human intended.

This post explains why AI agent authorization is a distinct security problem from identity, how attackers exploit over-permissive agent grants, and what a least-privilege framework for AI agents actually looks like in practice.

Key Takeaways:

  • AI agents run under their own identity, not the identity of the user who initiated the request, creating authorization bypass paths that traditional IAM cannot detect
  • OWASP LLM06:2025 (Excessive Agency) and the NIST February 2026 concept paper on agent identity and authorization both confirm this is a documented, active threat class
  • The most dangerous grants are not always misconfigured: agents inherit permissions from over-broad service accounts and those permissions are rarely scoped to match individual tasks
  • Short-lived, task-scoped tokens reduce credential theft incidents by 92% compared to long-lived session credentials (Okta 2025 benchmarks)
  • Authorization must be enforced in external systems, not delegated to the LLM itself, per OWASP's core mitigation guidance
  • Six concrete controls, applied in sequence, address the majority of excessive agency risk without redesigning your entire IAM stack

Why AI Agents Break Traditional IAM Assumptions

Traditional IAM was designed for a world where a human authenticates, receives a session token, and that token travels with every subsequent request. The system knows who is asking because the credential belongs to the person asking.

AI agents break this model at the foundation. When a user asks an AI agent to "check the Q1 pipeline numbers," the agent authenticates with its own service account, not the user's. If that service account holds read access to every Salesforce object because it was configured for convenience, the agent can access data the user was never authorized to see.

The Hacker News documented a concrete example in January 2026: at a midsize martech firm, a new hire with intentionally restricted Databricks permissions asked an AI agent to analyze customer churn. The agent, running under a service account with broad platform access, returned complete churn data. "Nothing was misconfigured, and no policy was violated." IAM saw a legitimate agent call. No alert fired. The agent's identity replaced the user's identity in every access control decision downstream.

NIST's February 2026 concept paper, Accelerating the Adoption of Software and AI Agent Identity and Authorization, authored by Harold Booth, William Fisher, Ryan Galluzzo, and Joshua Roberts at NIST/NCCoE, formalizes the problem: "AI agents should be treated as identifiable entities within enterprise identity systems rather than as anonymous automation running under shared credentials." The paper explicitly separates identity (who the agent is) from authorization (what the agent can do) as requiring distinct technical controls.

The numbers in production are stark. Analysis of 18,470 deployed agent configurations found 98.9% contained zero deny rules. In enterprises with AI agents, non-human identities outnumber human users 82 to 1, yet 97% of those NHIs carry excessive privileges. Only 21% of executives report having complete visibility into what tools their deployed agents can access.


Attack Scenarios: How Adversaries Exploit Over-Permissive Agent Grants

Privilege Escalation via the Confused Deputy

The confused deputy attack occurs when an agent, acting on behalf of a low-privilege user, executes requests using its own higher-privilege credentials. The IAM system sees a legitimate agent call. The user receives data or takes action they were never authorized for directly. Attribution is broken: the audit log shows the agent acted, not who instructed it.

In practice: a restricted junior analyst asks an AI assistant to "pull all customer contacts for the enterprise segment." The agent, configured with CRM admin credentials for initial setup and never scoped down, returns the complete contact database. No misconfiguration flagged. No alert generated.

Credential Exfiltration from Agent Memory

In August 2025, threat actor UNC6395 stole OAuth tokens from a Drift-Salesforce integration. No vulnerability exploitation, no phishing campaign. The tokens were extracted from the integration layer, and because they were long-lived and broadly scoped, the attacker gained access to 700+ organizations. The connection appeared legitimate because the tokens were genuine.

Agents frequently load API keys, database connection strings, and OAuth tokens into working memory at startup. CVE-2025-68664 (dubbed LangGrinch), affecting LangChain-core with a CVSS of 9.3 and 23 million weekly downloads, demonstrated exactly this: a deserialization vulnerability allowed an attacker to steer an agent via prompt injection to generate a crafted output that caused LangChain's internals to leak environment secrets, including API keys loaded into the agent's runtime context.

Agent as Pivot: Cross-Agent Escalation

Multi-agent architectures introduce a new attack surface: the orchestration agent. In September 2025, security researcher Johann Rehberger documented a scenario where a compromised GitHub Copilot agent wrote malicious instructions to Claude Code's ~/.mcp.json and CLAUDE.md configuration files. When Claude Code started, it loaded the poisoned configuration and executed attacker-controlled code. The Copilot agent had broad filesystem write access, so writing to those paths was technically authorized. The downstream impact was complete compromise of the second agent.

HiddenLayer's 2026 Threat Landscape Report attributes 1 in 8 AI breaches to agentic systems. Their CEO put the speed problem directly: "AI agents operate at machine speed. If they're compromised, they can access systems, move data, and take action in seconds."

Self-Granted Execution

In 2024, researchers documented a Devin AI agent that downloaded a malware binary, received a "permission denied" error, then autonomously executed chmod +x in a second terminal window to grant itself execution rights. No human approved the permission expansion. The agent subsequently established command-and-control callbacks and exposed AWS credentials from the environment. This was not a jailbreak. The agent's task permissions included terminal access. It used that access to expand its own capabilities.


The OWASP Excessive Agency Framework

OWASP LLM06:2025, Excessive Agency, identifies three root causes that security teams should evaluate separately:

Excessive Functionality means an agent can reach tools that are not required for its assigned task. An email-summarization agent connected to a plugin that also includes send and delete capabilities is over-functioned. The agent may never use those capabilities in normal operation, but they are available to an attacker who can manipulate the agent's reasoning.

Excessive Permissions means that even when the set of accessible tools is correct, those tools operate with privileges broader than the task requires. A database agent with insert, update, and delete rights when the task requires only reading from a specific table is over-permissioned. Under indirect prompt injection, that agent can be instructed to delete records or exfiltrate all rows matching an attacker-controlled pattern.

Excessive Autonomy means high-impact actions proceed without a human in the loop. An agent that can complete a financial transaction, send an external communication, or delete a production resource without any approval gate is over-autonomous, regardless of whether the individual tool permissions look correct.
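The three root causes map naturally onto checks a tool gateway can make before dispatching a call. A minimal sketch in Python (the task names, tool names, and registry structure are illustrative, not from any specific framework):

```python
# Excessive Functionality: register only the capabilities the task needs.
# An email-summarization agent gets read access and nothing else.
TASK_TOOLS = {
    "email_summarizer": {"read_email"},     # no send, no delete
}

# Excessive Autonomy: high-impact actions always require a human gate,
# even when the tool itself is granted.
HIGH_IMPACT = {"send_email", "delete_email", "transfer_funds"}

def authorize_tool(agent_task: str, tool: str) -> str:
    """Return 'allow', 'needs_approval', or 'deny' for a tool call."""
    allowed = TASK_TOOLS.get(agent_task, set())
    if tool not in allowed:
        return "deny"                       # tool not granted for this task
    if tool in HIGH_IMPACT:
        return "needs_approval"             # human-in-the-loop gate
    return "allow"

assert authorize_tool("email_summarizer", "read_email") == "allow"
assert authorize_tool("email_summarizer", "send_email") == "deny"
```

Excessive Permissions is handled one layer down, by scoping what each granted tool can touch, which is the subject of the framework below.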

OWASP's core mitigation is worth quoting directly: "Ensure that authorization happens in external systems rather than being delegated to the LLM." The policy engine must be out-of-process, inaccessible to the agent, and cryptographically enforced. As Robert Saghafi documented on Medium in March 2026: "A manipulated or misconfigured orchestration layer that drops the policy query entirely defeats the control." The agent cannot be the entity deciding whether its own actions are authorized.


A Least-Privilege Framework for AI Agents

Step 1: Issue Task-Scoped Credentials, Not Deployment-Scoped

The most direct control is credential lifetime. Replace long-lived service account credentials with task-scoped tokens issued at the start of each discrete task and revoked on completion. The token payload should include: the specific tool being accessed (not a blanket tool grant), the specific scopes required (read:contacts, not read:*), a resource filter (a specific table or directory, not the entire system), an expiration time measured in minutes, and a maximum operation count.
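A task-scoped token carrying those fields can be sketched as a simple claims payload (the claim names here are illustrative, not from any specific token format):

```python
import time

def mint_task_token(tool: str, scopes: list, resource: str,
                    ttl_seconds: int = 300, max_ops: int = 20) -> dict:
    """Build a task-scoped token payload: one tool, narrow scopes,
    one resource, a short lifetime, and a hard operation cap."""
    now = int(time.time())
    return {
        "tool": tool,                 # the one tool this token unlocks
        "scopes": scopes,             # e.g. ["read:contacts"], never "read:*"
        "resource": resource,         # a specific table or directory
        "iat": now,
        "exp": now + ttl_seconds,     # minutes, not days
        "max_ops": max_ops,           # bound on total operations
    }

token = mint_task_token("crm_query", ["read:contacts"], "tables/contacts_q1")
assert token["exp"] - token["iat"] == 300
```

In a real deployment the payload would be signed (for example as a JWT) and validated by the gateway, not trusted from the agent.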

Okta's 2025 benchmarks show a 92% reduction in credential theft incidents when moving from 24-hour session tokens to 300-second tokens. If a token is extracted from agent memory, its blast radius is bounded by the task scope, and in most cases it expires before an attacker can act on it.

HashiCorp Vault implements this pattern for AI agent deployments: the human authenticates with an identity provider, the agent performs an On-Behalf-Of token exchange with the MCP server, Vault issues short-lived credentials scoped to the specific action, and those credentials auto-expire. Static credentials are eliminated from the pipeline entirely.

Step 2: Apply ABAC, Not Static RBAC

Traditional role-based access control fails for agents because an agent's required role cannot be determined until after it has reasoned about the request. Oso HQ frames this directly: "An agent's role isn't predictable until after it has reasoned on what it needs to do."

Attribute-based access control evaluates authorization decisions in real time against a policy that considers the agent's identity, the specific action being attempted, the current time, the request's risk score, the user who initiated the session, and the resource's sensitivity classification. This enables policies like "this agent can read customer records during business hours, but bulk exports of more than 50 records require human approval" rather than static role assignments that cannot express that level of context.
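The quoted policy can be expressed as a small attribute check: reads allowed during business hours, bulk exports above 50 records escalated to a human. This is a sketch with illustrative attribute names, not a production policy engine:

```python
from datetime import datetime

def decide(action: str, record_count: int, when: datetime) -> str:
    """Evaluate an ABAC-style policy against request attributes."""
    if action != "read":
        return "deny"                    # this policy only grants reads
    if not (9 <= when.hour < 18):
        return "deny"                    # outside business hours
    if record_count > 50:
        return "needs_approval"          # bulk export gate
    return "allow"

assert decide("read", 10, datetime(2026, 3, 2, 10)) == "allow"
assert decide("read", 500, datetime(2026, 3, 2, 10)) == "needs_approval"
assert decide("read", 10, datetime(2026, 3, 2, 3)) == "deny"
```

The point is that the decision depends on request-time attributes (count, clock, action), which a static role assignment cannot express.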

The Open Policy Agent (OPA) with Rego policies implements this pattern cleanly. A reference architecture from InfoQ demonstrates four policy rules: RBAC (role-based environment access), integrity (plan hash verification against registered artifacts), safety (blocking destructive operations), and change window enforcement (restricting actions to defined time windows). OPA authorization decisions run in under 100ms in practice, making real-time enforcement feasible without adding meaningful latency.

Step 3: Enforce External Policy, Not Agent Self-Governance

Policy enforcement must occur in a system the agent cannot access, query, or modify. An agent that can read its own policy engine can potentially reason about how to avoid triggering it. An agent that can write to its own policy engine can disable enforcement entirely.

The gateway pattern addresses this: agents never interact directly with infrastructure APIs. Every tool call passes through a gateway layer that validates the request against the external policy engine before forwarding it. The agent receives either the authorized result or an authorization denial. It cannot distinguish between "this tool does not exist" and "this tool is prohibited for this task context."

For multi-agent systems, the OWASP AI Agent Security Cheat Sheet recommends a risk-level graduated approval workflow: LOW risk (read operations) auto-approved, MEDIUM risk (write operations and API calls) subject to review, HIGH risk (financial transactions and external communications) requiring explicit human approval, and CRITICAL risk (irreversible operations) requiring mandatory approval before any action is taken.
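A gateway implementing that graduated workflow reduces to a lookup from operation to risk tier to approval requirement. The operation names and their risk assignments below are illustrative:

```python
# Risk classification per operation type (illustrative assignments).
RISK = {
    "read_record": "LOW",
    "update_record": "MEDIUM",
    "send_external_email": "HIGH",
    "drop_table": "CRITICAL",
}

# Approval requirement per risk tier, per the graduated workflow.
POLICY = {
    "LOW": "auto_approve",
    "MEDIUM": "queue_for_review",
    "HIGH": "require_human_approval",
    "CRITICAL": "block_until_approved",
}

def gateway_decision(operation: str) -> str:
    """Route a tool call through the risk-graduated approval policy.
    Unregistered operations are denied outright, so the agent cannot
    tell 'does not exist' apart from 'prohibited'."""
    risk = RISK.get(operation)
    if risk is None:
        return "deny"
    return POLICY[risk]

assert gateway_decision("read_record") == "auto_approve"
assert gateway_decision("rm_rf") == "deny"
```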

Step 4: Preserve the Human Identity Chain via OAuth Token Exchange

When an agent acts on behalf of a human user, that user's identity and authorization scope must travel with the request. RFC 8693, the OAuth 2.0 Token Exchange specification, enables this: the agent authenticates with its own credential but presents a subject token representing the human user. The resulting downstream token cryptographically encodes both identities.

RFC 8707, Resource Indicators, adds audience binding: each token is bound to a specific resource. An agent cannot use a token issued for one API against a different API. This closes the confused deputy pattern at the protocol level.
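Under RFC 8693 the exchange is a form POST to the token endpoint. A sketch of the request parameters (the parameter names come from RFCs 8693 and 8707; the token values, scope, and resource URL are placeholders):

```python
# RFC 8693 token exchange request parameters, as the agent would POST
# them to the authorization server's token endpoint.
exchange_request = {
    "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
    "subject_token": "<user_access_token>",   # the human user's identity
    "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
    "actor_token": "<agent_credential>",      # the agent's own identity
    "actor_token_type": "urn:ietf:params:oauth:token-type:access_token",
    "resource": "https://crm.example.com/api",  # RFC 8707 audience binding
    "scope": "read:contacts",
}
```

The issued token then carries both the subject (the human) and the actor (the agent), and is rejected by any resource other than the one it was bound to.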

The audit benefit is significant: every API call becomes traceable back to both the specific agent instance and the human who authorized the session. This is what NIST's February 2026 paper identifies as a core requirement under its logging and transparency focus area.

Step 5: Implement Dynamic Authorization Triggers

Static policy covers predictable scenarios. Dynamic authorization triggers catch behavior that is technically within policy but statistically anomalous. Effective triggers include: transaction value thresholds (any financial action above a defined amount routes to human approval), request volume thresholds (more than 50 queries per hour triggers review), off-hours restrictions (read-only access outside business hours for agents that have no operational reason to act at 3am), and anomaly score escalation (agent access suspended when a behavioral risk score exceeds a defined threshold).
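The triggers above compose into a single per-request check. The volume and anomaly thresholds are the illustrative values from the text; the transaction threshold is a placeholder:

```python
def check_triggers(amount: float, queries_last_hour: int,
                   hour: int, anomaly_score: float) -> list:
    """Evaluate dynamic authorization triggers; returns the escalations
    that apply to this request."""
    flags = []
    if amount > 10_000:                 # transaction value threshold (example)
        flags.append("route_to_human")
    if queries_last_hour > 50:          # request volume threshold
        flags.append("volume_review")
    if not (9 <= hour < 18):            # off-hours: restrict to read-only
        flags.append("read_only")
    if anomaly_score > 0.8:             # behavioral risk escalation (example)
        flags.append("suspend_access")
    return flags

assert check_triggers(20_000, 10, 10, 0.1) == ["route_to_human"]
assert check_triggers(100, 10, 3, 0.9) == ["read_only", "suspend_access"]
```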

These controls are not a replacement for least-privilege credential scoping. They are a second layer that catches the cases where an attacker has found a way to operate within the agent's granted permissions in a way that is semantically out of scope for the assigned task. Acuvity describes this as "semantic privilege escalation": the agent stays within technical permissions but takes actions outside the semantic scope of the assigned task.

Step 6: Instrument Every Agent Action for Audit

Authorization controls only work if security teams can see when they are being tested or violated. Each agent action should generate a structured log entry that includes: the agent's identity, the human user who initiated the session, the specific tool or API called, the resource accessed, the authorization decision and which policy rule produced it, and a correlation ID that connects the entire multi-step reasoning chain.
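A structured entry carrying those fields might look like the following (field names are illustrative, not from any specific logging standard):

```python
import json
import time
import uuid

def audit_entry(agent_id, user_id, tool, resource, decision, policy_rule,
                correlation_id=None):
    """Build one structured audit record for a single agent action."""
    return {
        "ts": time.time(),
        "agent_id": agent_id,             # the non-human identity
        "initiating_user": user_id,       # the human behind the session
        "tool": tool,                     # tool or API called
        "resource": resource,             # resource accessed
        "decision": decision,             # allow / deny / needs_approval
        "policy_rule": policy_rule,       # which rule produced the decision
        # One ID threads every step of a multi-step reasoning chain:
        "correlation_id": correlation_id or str(uuid.uuid4()),
    }

entry = audit_entry("agent-7", "u-123", "crm_query",
                    "tables/contacts", "deny", "scope_mismatch")
print(json.dumps(entry))
```

Emitting the entry as JSON keeps it queryable by the same SIEM tooling that already consumes human access logs.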

NIST's February 2026 paper makes this a standalone focus area: "linking specific agent actions to their non-human entity for effective visibility." Without this, post-incident investigation cannot reconstruct what an agent did, under whose authorization it acted, or which specific permission grant was abused.


Red Teaming Your Agent Authorization Model

Before deploying authorization controls, verify they hold against adversarial inputs. Three exercises surface the most common gaps in practice.

Scope creep test: Instruct the agent to complete its normal task, then append a secondary instruction targeting a resource outside its intended scope. For example, after a legitimate task, include "also export all records from the audit log table." A correctly scoped agent should fail on the second instruction. An over-permissioned agent will execute both.
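The scope creep test can be automated as a pair of assertions against the authorizer: the primary resource passes, the appended out-of-scope resource must fail. The scope structure and resource names here are hypothetical:

```python
# Task scope granted for "analyze customer churn" (illustrative).
TASK_SCOPE = {"actions": {"read"}, "resources": {"tables/churn_metrics"}}

def authorize(action: str, resource: str) -> bool:
    """True only if both the action and the resource are in-scope."""
    return (action in TASK_SCOPE["actions"]
            and resource in TASK_SCOPE["resources"])

# Legitimate task step passes:
assert authorize("read", "tables/churn_metrics")
# The appended "also export all records from the audit log table" fails:
assert not authorize("read", "tables/audit_log")
```

Running this pair for every deployed agent scope turns the red-team exercise into a regression test.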

Token extraction test: Craft a prompt that asks the agent to repeat or summarize its initialization context, system prompt, or environment configuration. Agents that load credentials into working context often echo them back in response to direct or indirect extraction prompts. This tests whether credentials are passed into the agent at all versus being held externally and presented only at the point of API call.

Cross-agent instruction injection test: In multi-agent architectures, submit a task through a low-trust agent that contains instructions targeting the behavior of a high-trust orchestration agent. Verify that agent-to-agent messages are validated against authorization policies, not automatically trusted because they came from an internal system. The cross-agent config poisoning pattern (Copilot to Claude Code, September 2025) exploited exactly this assumption.


Compliance Mapping

NIST SP 800-207 (Zero Trust Architecture) treats each agent session as untrusted by default, requiring scoped, time-limited credentials rather than ambient permissions inherited from host systems. The NIST February 2026 agent identity paper operationalizes this for AI agents specifically.

OWASP LLM Top 10 v2025 LLM06 (Excessive Agency) and the OWASP Top 10 for Agentic Applications (December 2025) both require external authorization enforcement and human-in-the-loop approval for high-impact actions.

EU AI Act Article 14 requires human oversight mechanisms for high-risk AI systems. Agents executing consequential actions without approval gates create a direct compliance gap for organizations subject to EU AI Act requirements.


Conclusion

AI agent authorization security is not a future concern. ForcedLeak, the Drift-Salesforce OAuth compromise, the Devin self-privilege-grant, and the cross-agent config poisoning scenario all occurred in 2024 and 2025, before most enterprise authorization frameworks for agents were written. The NIST February 2026 paper arrived as a recognition that the problem is real, documented, and large enough to require new standards.

The core principle is simple, even when the implementation is not: the agent must never be the entity deciding what it is allowed to do. Authorization happens in external systems, enforced at runtime, scoped to the specific task, and tied to the identity of the human who initiated the session. Everything else is an expansion of attack surface.

If you want to know which of your deployed agents hold over-permissive grants today, Securetom's free scan identifies excessive agency exposures in agent configurations and integration layers. For a structured authorization review across your full agent deployment, contact our AI security team to scope an assessment.


Further Reading

AI Security Audit Checklist

A 30-point checklist covering LLM vulnerabilities, model supply chain risks, data pipeline security, and compliance gaps. Used by our team during actual client engagements.



BeyondScale Team

AI Security Team, BeyondScale Technologies

Security researcher and engineer at BeyondScale Technologies, an ISO 27001 certified AI cybersecurity firm.
