Coding Agent Sandboxes Don't Solve Credential Authorization

Sandboxing a coding agent isolates it from the host—but the real blast radius is the credentials it holds. GitHub tokens, cloud keys, MCP connections, and CI/CD access define what an agent can actually do. Here's the runtime permission model that closes the gap.

Or Weis

Jun 12 2026

A lot of teams are having the same uncomfortable conversation right now: we moved our coding agent into a locked-down container, so why does security still feel fragile? The answer is that container hardening and VM isolation solve one class of problem—the host compromise problem—but not the authority problem. If your agent can still push to protected branches, publish packages, deploy infra, read inboxes, or call high-power MCP tools, the blast radius is defined by those privileges, not by namespace isolation.

That's the architectural tension behind secure coding agents today. We're very good at talking about runtime boundaries, but still too casual about coding-agent credentials. You can configure Claude Code's OS-layer security controls perfectly and still hand the agent effective admin authority through GitHub tokens, cloud credentials, and browser-backed sessions. That gap is where most serious failures will happen.

Host isolation vs authority isolation

Host isolation and authority isolation are different security properties that happen to get discussed as if they were one. Host isolation is about containing code execution: containers, microVMs, seccomp profiles, egress controls, read-only mounts, and process-level guardrails. It answers, "if this process is malicious, how far can it move on this machine?"

Authority isolation answers a different question: "what can this process do through legitimate APIs and trusted control planes?" An agent doesn't need a kernel escape if it already has a token that can merge to main, trigger production deploy, or read secrets from a vault. In practice, most damaging incidents won't look like classic host compromise—they'll look like authorized misuse.

This is the industry mistake worth naming in plain language: coding-agent sandboxing is necessary, but it addresses a different security dimension than coding-agent authorization. You can be perfect on one axis and dangerously weak on the other. A fully isolated agent with a high-scope GitHub PAT and cloud CLI creds is still a high-authority operator.

The real credential surface of a coding agent

Most teams underestimate the credential surface because they reason only about what they explicitly passed to the agent in environment variables. But modern coding workflows leak authority through many side channels: local CLIs already authenticated, browser sessions already warm, CI tokens in repo secrets, and MCP servers that proxy additional capabilities. Least privilege for coding agents has to model all of that, not just .env.

The discussion around Claude Code security has already surfaced this reality in public practitioner forums. The Claude Fable Hacker News thread has multiple engineers explicitly calling out Gmail access, password-reset abuse, browser-profile exposure, and .env/MCP pathways as the real concern—not merely "can it delete local files." That framing is directionally correct: the dangerous path is often credential reuse, not sandbox breakout.

The credential inventory that matters in practice looks like this:

GitHub App tokens and PATs: Scope is everything. contents:read and metadata:read are very different from contents:write, pull_requests:write, or org-admin permissions. GitHub agent permissions should be task-scoped and repo-bounded by default.
Package registry credentials (npm, PyPI, crates.io, etc.): This is distribution-plane risk. A compromised publish token can ship malicious artifacts to downstream consumers even if source control remains clean.
Cloud CLI/API credentials (AWS, GCP, Azure): Access keys, service-account credentials, federated sessions, and managed identities all become infrastructure authority when reachable by an agent runtime.
Email/SMTP/Gmail credentials: Email authority is meta-authority. It enables phishing, workflow impersonation, and password-reset interception across unrelated systems.
Browser session cookies and OAuth tokens: Active browser sessions can bypass fresh MFA prompts and effectively hand over already-authenticated state.
CI/CD tokens and pipeline identities: These are operational credentials. They can run builds, inject artifacts, modify the release flow, and deploy.
MCP server connections: MCP server security is now core, not optional. MCP tools can amplify authority by proxying to systems the agent otherwise couldn't reach.
SaaS API keys (Jira, Slack, Notion, etc.): These create organization-wide side effects—ticket churn, notification abuse, data exposure, and social engineering opportunities.
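
As a rough illustration of the inventory above, the following sketch lists credentials reachable from an agent's runtime through environment variables and default CLI state locations. The variable names and paths are common conventions chosen for illustration, not an exhaustive or authoritative list, and `inventory_credentials` is a hypothetical helper. It does not cover browser sessions or MCP connections, which need their own review.

```python
import os
from pathlib import Path

# Environment variable names commonly used for credentials.
# Illustrative only: real environments will have others.
CREDENTIAL_ENV_VARS = {
    "GITHUB_TOKEN": "GitHub",
    "GH_TOKEN": "GitHub",
    "AWS_ACCESS_KEY_ID": "AWS",
    "AWS_SECRET_ACCESS_KEY": "AWS",
    "AWS_SESSION_TOKEN": "AWS",
    "GOOGLE_APPLICATION_CREDENTIALS": "GCP",
    "NPM_TOKEN": "npm",
    "CARGO_REGISTRY_TOKEN": "crates.io",
}

# Default locations (relative to the home directory) where already
# authenticated CLIs tend to keep state. Also illustrative.
CREDENTIAL_FILES = {
    ".aws/credentials": "AWS CLI",
    ".config/gh/hosts.yml": "GitHub CLI",
    ".config/gcloud": "gcloud CLI",
    ".npmrc": "npm",
    ".docker/config.json": "Docker",
    ".kube/config": "kubectl",
}


def inventory_credentials(env=None, home=None):
    """Return (kind, name, system) tuples for credentials visible to this process."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    findings = []
    for name, system in CREDENTIAL_ENV_VARS.items():
        if env.get(name):  # set and non-empty
            findings.append(("env", name, system))
    for rel_path, system in CREDENTIAL_FILES.items():
        if (home / rel_path).exists():
            findings.append(("file", rel_path, system))
    return findings


if __name__ == "__main__":
    for kind, name, system in inventory_credentials():
        print(f"{kind:5} {name:35} {system}")
```

Running this inside the agent's container, rather than on the developer's laptop, is the point: it shows what the agent process can actually reach.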

The risk is not theoretical. Obsidian Security's LiteLLM privilege escalation research demonstrates a path from low-privilege access to admin-level control and then forged downstream tool execution against agents like Claude Code, including MCP-related execution paths. And Antimetal's automation implementation writeup makes the opposite but equally important point: autonomous systems increasingly do need direct authenticated production access to deliver value. This is architectural reality, not fearmongering.

Classifying coding-agent tool calls by risk tier

A tiered model exists because "tool call" is not a meaningful risk unit. cat README.md and "merge to main + deploy prod" are not siblings; treating them as equivalent is how teams end up with catastrophic default approvals. Tool trust levels are the bridge between policy language and operational controls.

This is especially relevant for Cursor MCP security and Claude Code security deployments where users expect fluid interaction. If you put the same friction on every call, people disable controls. If you put no friction on high-impact calls, incidents become inevitable. Risk-tiered controls are the only workable middle path.

| Tier | Examples | Risk profile | Typical control |
| --- | --- | --- | --- |
| Read/List | `git clone`, `git log`, `grep`, `ls`, `cat` | Low; observational; no direct side effects | Auto-allow with logging |
| Edit/Write | file write, `git commit`, branch push | Medium; reversible but defect-introducing | Policy allow + scope checks |
| PR/Review | open PR, request reviewers, issue-state changes | Elevated; org/social surface | Conditional allow, stronger identity binding |
| Merge/Deploy | merge protected branches, trigger CI/CD, deploy envs | High; business impact, potentially irreversible | Human approval for agent actions |
| Secret/Credential access | read secret manager, write `.env`, rotate keys | Critical; privilege amplification | Explicit approval + JIT grant |
| Destructive shell | `rm -rf`, `DROP DATABASE`, infra teardown | Critical/irreversible | Default deny or break-glass approval |

Once tiers exist, policy becomes legible: read paths stay fast; write paths enforce resource scoping; merge/deploy and secret/destructive paths require explicit, invocation-specific authorization. This is what coding agent authorization should look like in production: proportional control, not blanket paranoia and not blanket trust.
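
A minimal sketch of how the tier table might be encoded follows. The tool names in `TOOL_TIERS` are hypothetical (only `github.merge_pull_request` appears elsewhere in this article), and sending unknown tools to the most restrictive tier is an assumed design choice rather than something the tier model prescribes.

```python
from enum import Enum


class Tier(Enum):
    READ = "read_list"
    WRITE = "edit_write"
    PR_REVIEW = "pr_review"
    MERGE_DEPLOY = "merge_deploy"
    SECRET = "secret_access"
    DESTRUCTIVE = "destructive"


# Typical control per tier, mirroring the table.
CONTROLS = {
    Tier.READ: "auto_allow_with_logging",
    Tier.WRITE: "policy_allow_with_scope_checks",
    Tier.PR_REVIEW: "conditional_allow",
    Tier.MERGE_DEPLOY: "human_approval",
    Tier.SECRET: "explicit_approval_plus_jit_grant",
    Tier.DESTRUCTIVE: "default_deny_or_break_glass",
}

# Hypothetical tool names mapped to tiers.
TOOL_TIERS = {
    "fs.read": Tier.READ,
    "git.log": Tier.READ,
    "fs.write": Tier.WRITE,
    "git.commit": Tier.WRITE,
    "github.open_pull_request": Tier.PR_REVIEW,
    "github.merge_pull_request": Tier.MERGE_DEPLOY,
    "ci.trigger_deploy": Tier.MERGE_DEPLOY,
    "secrets.read": Tier.SECRET,
    "shell.rm_rf": Tier.DESTRUCTIVE,
}


def classify(tool: str) -> Tier:
    # Assumption: a tool nobody has classified is treated as the most
    # restrictive tier, so new tools never inherit auto-allow by accident.
    return TOOL_TIERS.get(tool, Tier.DESTRUCTIVE)


def control_for(tool: str) -> str:
    return CONTROLS[classify(tool)]
```

A static lookup like this only covers the tool name. In practice the tier often depends on arguments too (a push to a feature branch versus a push to a protected branch), so a real classifier would likely inspect parameters as well.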

Zero standing permissions and delegated access

Zero standing permissions for AI agents means exactly what it says: the agent starts with no persistent authority. At provisioning time, it gets identity, telemetry hooks, and policy context—but not long-lived GitHub PATs, cloud keys, registry publish tokens, or always-on SaaS credentials. Capability is granted only when needed.

In a mature setup, access is just-in-time, narrowly scoped to the declared task, and auto-expired by default. If an agent is assigned "prepare PR for bugfix in repo X," then rights should be limited to that repo, that branch pattern, that operation class, and a short validity window. If it stalls or deviates, grants should time out and revoke automatically.
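
The shape of such a grant can be sketched as below. `JitGrant` and `issue_grant` are hypothetical names, and the repo, branch prefix, and operation strings are placeholders. The sketch only models the check, not how the underlying credential is minted or revoked at the provider.

```python
import time
from dataclasses import dataclass


@dataclass
class JitGrant:
    session_id: str
    repo: str
    branch_prefix: str
    operations: frozenset
    expires_at: float
    revoked: bool = False

    def permits(self, repo, branch, operation, now=None):
        """True only if the call matches the grant's scope and the grant is still live."""
        now = time.time() if now is None else now
        return (
            not self.revoked
            and now < self.expires_at
            and repo == self.repo
            and branch.startswith(self.branch_prefix)
            and operation in self.operations
        )


def issue_grant(session_id, repo, branch_prefix, operations, ttl_seconds, now=None):
    """Issue a grant limited to one repo, one branch pattern, and a short window."""
    now = time.time() if now is None else now
    return JitGrant(session_id, repo, branch_prefix, frozenset(operations), now + ttl_seconds)


# Usage: a 15-minute grant for preparing a bugfix PR in one repo.
grant = issue_grant("sess_a81d", "org/repo-x", "bugfix/", {"push", "open_pr"}, ttl_seconds=900)
print(grant.permits("org/repo-x", "bugfix/checkout-timeout", "push"))  # in scope
print(grant.permits("org/repo-x", "main", "push"))                     # wrong branch
```

Expiry is checked on every call rather than by a background timer, so a stalled agent loses access simply because time passes.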

Delegated access is the other half. The agent acts on behalf of a human principal, not as a superuser shadow account. That means effective rights must be the intersection of: the human's permissions, declared task scope, tool tier policy, and temporal window. If the delegating engineer cannot merge to production, the agent must not be able to merge either. If the task is read-only triage, write pathways stay closed.
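
The intersection rule can be written down almost literally. In this sketch, permissions are modeled as hypothetical `(action, resource)` pairs and the temporal window as a `(start, end)` pair of timestamps; both representations are assumptions made for illustration.

```python
def effective_permissions(human_perms, task_scope, tier_allowed, now, window):
    """Rights the agent may exercise: the intersection of every constraint."""
    start, end = window
    if not (start <= now < end):
        return frozenset()  # outside the temporal window nothing is allowed
    return frozenset(human_perms) & frozenset(task_scope) & frozenset(tier_allowed)


# The delegating engineer can push and open PRs, but cannot merge.
human = {("push", "repo-x"), ("open_pr", "repo-x"), ("push", "repo-y")}
# The task asks for push, PR, and merge on repo-x only.
task = {("push", "repo-x"), ("open_pr", "repo-x"), ("merge", "repo-x")}
# Tier policy would permit all three for this session.
tier = {("push", "repo-x"), ("open_pr", "repo-x"), ("merge", "repo-x")}

allowed = effective_permissions(human, task, tier, now=50, window=(0, 100))
print(sorted(allowed))  # merge is absent because the human cannot merge
```

Because the result is an intersection, adding a constraint can only shrink what the agent may do; no single input can widen it.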

This is the opposite of today's common anti-pattern: stuffing durable secrets into agent environment variables and calling it enablement. That model quietly creates standing privilege, turns prompt injection into privilege abuse, and makes post-incident attribution ambiguous. Zero standing permissions plus delegated authorization gives you a principled default: no authority without explicit, contextual, time-bounded intent.

Runtime enforcement with Permit.io

At this point, policy has to move from documentation to runtime decisions. Permit.io is useful here as an enforcement plane because each tool invocation can be evaluated in real time by a policy decision point (PDP) using the full context: human identity, agent identity, session/task metadata, requested operation, target resource, and risk tier.

That is the key shift from static secrets to live authorization. Instead of "token present => action allowed," the system asks, "is this exact call allowed now, by this agent, for this delegated human, on this resource, under this task scope?" For secure coding agents, this is the difference between binary trust and continuous policy.
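
One way to picture the shift is an enforcement wrapper that asks for a decision before every tool call and never runs the tool on a deny. The sketch below does not use the Permit.io SDK: `decide` is a hypothetical stand-in for whatever call reaches the PDP, and `toy_policy` is a deliberately trivial placeholder.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class CallContext:
    human_id: str
    agent_id: str
    task_id: str
    tool: str
    resource: str
    tier: str


def enforce(decide: Callable[[CallContext], bool], ctx: CallContext, run: Callable[[], Any]) -> dict:
    """Evaluate the decision first; execute the tool only on allow."""
    if decide(ctx):
        return {"decision": "allow", "result": run()}
    return {"decision": "deny", "result": None}


def toy_policy(ctx: CallContext) -> bool:
    # Placeholder policy: only read-tier calls inside the task's repo pass.
    return ctx.tier == "read_list" and ctx.resource.startswith("github://org/payment-service")


read_ctx = CallContext(
    "user_12345", "agent_claude_code_07", "task_fix_checkout_timeout",
    "git.log", "github://org/payment-service", "read_list",
)
merge_ctx = CallContext(
    "user_12345", "agent_claude_code_07", "task_fix_checkout_timeout",
    "github.merge_pull_request", "github://org/payment-service/pull/842", "merge_deploy",
)

print(enforce(toy_policy, read_ctx, lambda: "log output")["decision"])
print(enforce(toy_policy, merge_ctx, lambda: "merged")["decision"])
```

The important property is structural: the token that performs the action is never reachable by the agent unless the decision came back as allow for that specific context.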

High-risk operations should also support invocation-specific human approval flows. The agent pauses on merge/deploy, secret, or destructive-tier actions and requests approval for that one action with full parameters visible. This avoids dangerous blanket session consent and gives reviewers a chance to reject suspicious chains such as "merge + deploy + secret read".
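
A sketch of what "approval for that one action" can mean mechanically: the approval is bound to a hash of the exact tool, resource, and parameters, expires after a short window, and is consumed on use. `ApprovalStore` and `request_fingerprint` are hypothetical names, and the in-memory dictionary stands in for whatever durable store a real system would use.

```python
import hashlib
import json
import time


def request_fingerprint(tool, resource, parameters):
    """Stable hash of the exact request a reviewer is shown."""
    canonical = json.dumps(
        {"tool": tool, "resource": resource, "parameters": parameters},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ApprovalStore:
    def __init__(self):
        self._approvals = {}  # fingerprint -> expiry timestamp

    def approve(self, tool, resource, parameters, ttl_seconds, now=None):
        now = time.time() if now is None else now
        self._approvals[request_fingerprint(tool, resource, parameters)] = now + ttl_seconds

    def consume(self, tool, resource, parameters, now=None):
        """True once, and only for the exact request that was approved."""
        now = time.time() if now is None else now
        expires_at = self._approvals.pop(request_fingerprint(tool, resource, parameters), None)
        return expires_at is not None and now < expires_at


store = ApprovalStore()
tool = "github.merge_pull_request"
resource = "github://org/payment-service/pull/842"
approved_params = {"merge_method": "squash", "target_branch": "main"}
store.approve(tool, resource, approved_params, ttl_seconds=300)

# A changed parameter produces a different fingerprint, so the approval does not apply.
print(store.consume(tool, resource, {"merge_method": "squash", "target_branch": "release"}))
print(store.consume(tool, resource, approved_params))
```

Binding to the fingerprint is what prevents an approved request from being swapped for a different one between review and execution.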

A credible implementation also needs forensic-grade auditability. Every meaningful call should produce a record binding principal, delegation chain, tool input, policy decision, and outcome. A blocked merge example might look like:

{
  "event_id": "evt_9f3b2c1d",
  "timestamp": "2026-06-12T18:14:22Z",
  "human_principal": {
    "id": "user_12345",
    "email": "dev@example.com",
    "role": "engineer"
  },
  "agent": {
    "id": "agent_claude_code_07",
    "session_id": "sess_a81d",
    "task_id": "task_fix_checkout_timeout"
  },
  "request": {
    "tool": "github.merge_pull_request",
    "operation": "merge",
    "resource": "github://org/payment-service/pull/842",
    "tier": "merge_deploy",
    "parameters": {
      "merge_method": "squash",
      "target_branch": "main"
    }
  },
  "policy_decision": {
    "result": "deny",
    "reason": "human_principal_missing_permission_for_protected_branch",
    "matched_rule": "merge_requires_maintainer_and_change_window",
    "policy_version": "2026-06-12.4"
  },
  "jit_grant": {
    "status": "not_issued",
    "ttl_seconds": 0
  },
  "outcome": "operation_blocked_before_execution"
}
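
A record like the one above is only useful for attribution if the binding fields are always present. The sketch below checks a record for them, using the field names from the example; the choice of which fields count as required is an assumption, and `missing_fields` is a hypothetical helper.

```python
# Paths into the record that must exist for the event to be attributable.
REQUIRED_FIELDS = [
    ("timestamp",),
    ("human_principal", "id"),
    ("agent", "id"),
    ("agent", "session_id"),
    ("agent", "task_id"),
    ("request", "operation"),
    ("request", "resource"),
    ("policy_decision", "result"),
    ("policy_decision", "matched_rule"),
    ("policy_decision", "policy_version"),
    ("outcome",),
]


def missing_fields(record):
    """Return dotted paths of required fields absent from an audit record."""
    missing = []
    for path in REQUIRED_FIELDS:
        node = record
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


incomplete = {
    "timestamp": "2026-06-12T18:14:22Z",
    "human_principal": {"id": "user_12345"},
    "agent": {"id": "agent_claude_code_07", "session_id": "sess_a81d"},
    "request": {"operation": "merge", "resource": "github://org/payment-service/pull/842"},
    "policy_decision": {"result": "deny"},
    "outcome": "operation_blocked_before_execution",
}
print(missing_fields(incomplete))
```

Rejecting incomplete records at write time, rather than discovering the gaps during incident response, keeps the audit trail queryable when it matters.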

When tied to JIT grants keyed by session ID, expiring automatically and revocable mid-task, this model gives teams practical least privilege for coding agents without killing developer velocity. Read calls stay smooth, write calls require scope integrity, and secret/destructive calls require explicit human intent.

Frequently asked questions

Security teams and platform engineers are converging on a shared vocabulary, but implementation details are still uneven. The questions below are the ones that usually determine whether a program is genuinely safe or just cosmetically hardened.

Is sandboxing enough to secure a coding agent?

Sandboxing limits host-level compromise paths, but it does not limit what the agent can do with valid credentials. If the agent holds high-scope tokens, it can still perform high-impact actions through trusted APIs without any host escape. Effective security requires both host isolation and authority isolation—treating them as the same thing is how teams build a hardened container that still holds production keys.

What credentials should a coding agent be allowed to use?

Only short-lived, task-scoped credentials issued just in time for specific operations. Long-lived PATs, static cloud access keys, and persistent registry publish tokens should not live inside agent environments. The right model is zero standing permissions at provisioning time, with grants issued narrowly by tier and task and revoked automatically when the task completes or times out.

How does MCP server access change the risk profile?

MCP connections can significantly expand effective authority because they proxy external capabilities into the agent's toolset. If an MCP path is compromised—as Obsidian Security demonstrated with LiteLLM—the agent may execute actions it was never intended to perform directly, including forged tool calls against downstream systems. That is why MCP server security and per-tool authorization are core controls, not optional add-ons for high-risk deployments.

What are zero standing permissions for coding agents?

Zero standing permissions means the agent has no durable privileged credentials at rest between tasks—no always-on PATs, no persistent cloud keys, no live registry publish tokens in the environment. Access is granted at runtime, bounded in scope and time, then revoked automatically. This approach reduces credential theft impact and eliminates the standing privilege that makes prompt injection so dangerous.

How should teams implement human approval for destructive agent actions?

Teams should use invocation-level approvals rather than blanket "approve session" toggles. The reviewer needs to see the exact tool name, parameters, target resource, and policy context before approving—not just "the agent wants to do something." Approvals should be logged, time-bound, and invalidated if request parameters change between approval and execution.

What does a proper audit trail for coding-agent actions look like?

A strong audit trail binds human principal, agent identity, session and task identifier, requested operation, resource scope, policy result, matched rule, policy version, timestamp, and execution outcome into a single queryable record. Missing any of these fields weakens attribution during incident response. The goal is to be able to answer "who authorized what, under which policy, for which task, with what outcome?" for every meaningful tool call.

How does least privilege apply differently to coding agents vs human developers?

Agents execute at machine speed and can chain many actions in seconds, so over-permissioned access compounds far more quickly than it does for humans. Human developers rely on judgment pauses and friction; agents rely on policy gates enforced at the runtime layer. This means least privilege for agents requires tighter operation scoping, shorter credential lifetimes, and stronger runtime authorization checks than typical human access controls—the margin for error is smaller because the execution velocity is higher.

Written by

Or Weis

Co-Founder / CEO at Permit.io
