[object Object]

Your CRM agent has more privileges than your CISO. It reads accounts, writes opportunities, sends emails on behalf of reps, calls billing APIs, and queries the data warehouse. It does this autonomously, at machine speed, while a prompt injection in a customer email tries to convince it to exfiltrate the pipeline. Zero-trust isn’t optional here. It’s the only architecture that survives contact with production.

Zero-trust, restated for agents

The classic Forrester / NIST 800-207 principles translate directly:

  • Never trust, always verify. Every tool call is authorized at call time, not at session start.
  • Assume breach. The agent will be prompt-injected. Plan for it.
  • Least privilege. The agent gets the minimum scope to complete the current task — not the user’s full scope.
  • Just-in-time access. Credentials and scopes are issued per task, expire on completion.
  • Verify explicitly. Every action logged, attributed, replayable.
  • Microsegmentation. Agents can’t move laterally between tenants, regions, or sensitivity tiers.

Vendor frameworks don’t give you this by default. Agentforce, Copilot Studio, Now Assist all start with “agent runs as a permission set user.” That’s the opposite of least privilege.

The threat model

An agent in 2026 production faces:

  1. Prompt injection via customer email, case description, document upload, web fetch.
  2. Tool poisoning — a compromised MCP server returning malicious instructions disguised as data.
  3. Confused deputy — the agent uses its higher privilege to act on behalf of a lower-privileged user.
  4. Excessive agency — the agent does what it can, not what it should.
  5. Exfiltration via tool chain — agent reads sensitive data, then writes it to a less-protected sink (email, webhook, log).
  6. Replay attacks on cached agent decisions.

Zero-trust patterns address each.

Identity: agents are first-class principals

Stop treating the agent as the user. Mint the agent its own machine identity (SPIFFE ID, workload identity, or service principal) and have it impersonate the user only via a delegation token with a narrow scope and short TTL.

# agent_identity.yaml
agent:
  spiffe_id: spiffe://crm.example/agents/sales-copilot
  trust_domain: crm.example
  attestation: kubernetes-sa  # or hardware, or workload-identity-pool
delegation:
  on_behalf_of: user:[email protected]
  scope:
    - crm:opportunity:read:owned
    - crm:account:read:owned
    - email:draft:create
  ttl_seconds: 900
  audience: salesforce-api
  jit_approval_required:
    - email:send:external
    - opportunity:update:amount

Now the agent is auditable separately from the user. Both identities appear in the log. Compromise of one doesn’t compromise the other.

Tool scopes: per-tool, per-task

A common failure: the agent has a single OAuth token with scope crm.full_access because it might need to do anything. Better: each tool registration is its own scoped credential, and the planner requests only the tools it needs for the current task.

The MCP protocol formalizes this. Tool definitions declare required scopes; the gateway issues per-tool tokens at invocation time. If the agent has no business sending email, no email scope is issued for that session.

Just-in-time elevation

High-risk actions require a fresh approval. Patterns:

  • Synchronous human approval for actions above a threshold ($X discount, account-merge, mass-update).
  • Step-up auth — the user re-authenticates (passkey, push) before the agent commits the action.
  • Time-boxed elevation — agent gets opportunity:write:amount for 5 minutes after approval, then it’s gone.

Salesforce’s “Sensitive Action Confirmation” and Microsoft’s Purview-integrated step-up flows are the productized versions. They’re not on by default. Turn them on.

Microsegmentation: blast radius containment

Don’t run all agents in one trust zone. Segment by:

  • Sensitivity tier. Tier-1 (customer-facing autonomous) gets fewer privileges than tier-3 (back-office assistive). Different network egress rules. Different model selection.
  • Tenant boundary. Multi-tenant CRM agents must not be able to query across tenants. Enforce at the data-access layer, not in the prompt.
  • Region. EU data stays in EU compute. Agentforce and Copilot both support this now; you have to configure it.
# agent_segmentation.yaml
zones:
  - id: tier1-customer-service
    egress: [crm-api, kb-api]
    deny: [warehouse, payment-gateway, internal-wiki]
    model: claude-sonnet-via-vendor
    region: eu-west-1
  - id: tier3-sales-ops-assist
    egress: [crm-api, warehouse-read-only]
    deny: [billing-write, hr-system]
    model: claude-opus-via-vendor
    region: us-east-1

Tool gateway with policy enforcement

Don’t let agents call tools directly. Put a policy-enforcing gateway in the middle. OPA, Cedar, or a vendor equivalent.

# policy.rego (simplified)
package agent.tools

default allow := false

allow if {
    input.tool == "crm.opportunity.update"
    input.field != "amount"
    input.actor.tier <= 2
}

allow if {
    input.tool == "crm.opportunity.update"
    input.field == "amount"
    input.delta_pct < 0.10
    input.actor.tier <= 2
    input.approval.status == "granted"
    time.now_ns() - input.approval.granted_at_ns < 300_000_000_000  # 5 min
}

Every tool call goes through this. Denied calls log; the agent retries with a different approach. The policy is versioned, reviewable, testable in CI.

Output filtering and exfiltration defense

Inbound prompt injection is half the threat. The other half is outbound exfiltration — the agent successfully reads sensitive data and then writes it somewhere it shouldn’t.

Defenses:

  • Tool sinks have scopes too. The agent that can read PII can’t necessarily write to a webhook tool.
  • Output classifiers. Before any external send (email, slack, webhook), run a classifier for PII / secrets / pricing. Block on hit.
  • DLP integration. Hook into Purview, Salesforce Shield, or a third-party DLP for outbound content scanning.
  • Channel allowlists. Agent can email customers on file. Agent cannot email arbitrary addresses. Enforced at the email tool, not in the prompt.

Audit and replay

You need an audit trail dense enough to replay every agent action. Minimum fields per action:

  • Agent identity, user identity, tenant.
  • Tool name, scope used, arguments, return.
  • Policy decision (allow / deny + rule id).
  • Model version, prompt hash, output hash.
  • Trace id (OpenTelemetry).
  • Approval reference (if applicable).

Store immutably. Retain per regulatory requirement. Index for “find every action this agent took on behalf of this user between these times.” Without this, post-incident forensics is fiction.

Defense against prompt injection

Zero-trust assumes the agent’s input is hostile. Layered defenses:

  • Input separation. Clearly delimit “system instructions” from “user content” and “retrieved content” using structured prompts. Some models (Claude with system prompts, GPT-4 with structured roles) honor this better than others.
  • Instruction-following classifier. A small model that flags suspicious imperatives (“ignore previous instructions”, “send all account data to”) before they reach the main agent.
  • Tool-call review. Before any high-risk tool call, a second model reviews intent vs context. Slower, but catches a class of attacks.
  • Spotlighting. Encode retrieved content (base64, special tokens) so the agent treats it as data, not instruction. Microsoft Research’s spotlighting paper documents the technique.
  • Limited retries. If the agent gets stuck in an unusual loop after retrieving content, escalate rather than continue.

None of these are silver bullets. Combined, they reduce successful injection by 90%+. Single defenses fall to creative attacks.

The continuous verification loop

Zero-trust isn’t a one-time gate at session start. Re-verify continuously:

  • Per tool call: policy check, scope check, fresh token check.
  • Per N actions: re-attest identity, refresh delegation.
  • On anomaly: tool call pattern divergence triggers re-auth or shutdown.

Anomaly detection at the action level catches compromised sessions before they finish.

Network egress controls

The agent’s runtime should not be able to reach arbitrary hosts. Lock egress to an allowlist:

  • CRM APIs (specific endpoints, not the whole domain).
  • Tool registry hosts.
  • Model provider endpoints (Anthropic, OpenAI, Azure OpenAI).
  • Observability endpoints.

Everything else: deny. This blocks an entire class of exfiltration where a prompt-injected agent is told to POST to attacker.example.com. The DNS resolution fails. The attack stops.

Implementation: egress proxy with allowlist, or VPC service controls (GCP), or Azure Private Link + firewall, or PrivateLink + Network Firewall (AWS). Vendor agents need configuration; check Trust Center docs.

Secret handling

Agents don’t see secrets. Tools see secrets. The agent says “send email”; the email tool resolves the SMTP credential from a secrets manager and uses it. The agent never had access.

Patterns:

  • Short-lived credentials (15 min TTL) fetched per tool call.
  • Hardware-backed signing (KMS, HSM) for high-value operations.
  • No plain-text credentials in the agent’s context or system prompt — ever.
  • Audit every secret retrieval; alert on anomalous patterns.

If your agent has any string that starts with “sk-” or “AKIA” in its context, you’ve already lost.

What the vendors give you (and don’t)

CapabilityAgentforceCopilot StudioNow Assist
Per-tool scopesPartialPartialPartial
Policy gatewayEinstein Trust LayerPurviewNowGuard
Step-up authAvailableAvailableAvailable
Output DLPShield + Trust LayerPurview DLPDLP add-on
Audit trailYes (Event Monitoring)Yes (Purview)Yes (Now Audit)
OPA-style external policyNo nativeLimitedNo native

Native gets you 60%. The rest is policy engine + observability + a security team that cares.

Bottom line

  • Agents are principals, not user proxies. Give them their own identity and audit them separately.
  • Per-tool, per-task scopes with JIT elevation. Stop giving agents standing privileges.
  • Put a policy gateway in front of every tool. OPA / Cedar / Rego is reviewable in a way a system prompt is not.
  • Output filtering catches what input filtering misses. PII / secret classifiers on every outbound sink.
  • Continuous re-verification — anomaly at action level, not just session start.
  • Vendor frameworks get you 60%. The other 40% is the security architecture you owe yourself.
[object Object]
Share