Your CRM agent has more privileges than your CISO. It reads accounts, writes opportunities, sends emails on behalf of reps, calls billing APIs, and queries the data warehouse. It does this autonomously, at machine speed, while a prompt injection in a customer email tries to convince it to exfiltrate the pipeline. Zero-trust isn’t optional here. It’s the only architecture that survives contact with production.
Zero-trust, restated for agents
The classic Forrester / NIST 800-207 principles translate directly:
- Never trust, always verify. Every tool call is authorized at call time, not at session start.
- Assume breach. The agent will be prompt-injected. Plan for it.
- Least privilege. The agent gets the minimum scope to complete the current task — not the user’s full scope.
- Just-in-time access. Credentials and scopes are issued per task, expire on completion.
- Verify explicitly. Every action logged, attributed, replayable.
- Microsegmentation. Agents can’t move laterally between tenants, regions, or sensitivity tiers.
Vendor frameworks don’t give you this by default. Agentforce, Copilot Studio, Now Assist all start with “agent runs as a permission set user.” That’s the opposite of least privilege.
The threat model
An agent in 2026 production faces:
- Prompt injection via customer email, case description, document upload, web fetch.
- Tool poisoning — a compromised MCP server returning malicious instructions disguised as data.
- Confused deputy — the agent uses its higher privilege to act on behalf of a lower-privileged user.
- Excessive agency — the agent does what it can, not what it should.
- Exfiltration via tool chain — agent reads sensitive data, then writes it to a less-protected sink (email, webhook, log).
- Replay attacks on cached agent decisions.
Zero-trust patterns address each.
Identity: agents are first-class principals
Stop treating the agent as the user. Mint the agent its own machine identity (SPIFFE ID, workload identity, or service principal) and have it impersonate the user only via a delegation token with a narrow scope and short TTL.
# agent_identity.yaml
agent:
spiffe_id: spiffe://crm.example/agents/sales-copilot
trust_domain: crm.example
attestation: kubernetes-sa # or hardware, or workload-identity-pool
delegation:
on_behalf_of: user:[email protected]
scope:
- crm:opportunity:read:owned
- crm:account:read:owned
- email:draft:create
ttl_seconds: 900
audience: salesforce-api
jit_approval_required:
- email:send:external
- opportunity:update:amount
Now the agent is auditable separately from the user. Both identities appear in the log. Compromise of one doesn’t compromise the other.
Tool scopes: per-tool, per-task
A common failure: the agent has a single OAuth token with scope crm.full_access because it might need to do anything. Better: each tool registration is its own scoped credential, and the planner requests only the tools it needs for the current task.
The MCP protocol formalizes this. Tool definitions declare required scopes; the gateway issues per-tool tokens at invocation time. If the agent has no business sending email, no email scope is issued for that session.
Just-in-time elevation
High-risk actions require a fresh approval. Patterns:
- Synchronous human approval for actions above a threshold ($X discount, account-merge, mass-update).
- Step-up auth — the user re-authenticates (passkey, push) before the agent commits the action.
- Time-boxed elevation — agent gets
opportunity:write:amountfor 5 minutes after approval, then it’s gone.
Salesforce’s “Sensitive Action Confirmation” and Microsoft’s Purview-integrated step-up flows are the productized versions. They’re not on by default. Turn them on.
Microsegmentation: blast radius containment
Don’t run all agents in one trust zone. Segment by:
- Sensitivity tier. Tier-1 (customer-facing autonomous) gets fewer privileges than tier-3 (back-office assistive). Different network egress rules. Different model selection.
- Tenant boundary. Multi-tenant CRM agents must not be able to query across tenants. Enforce at the data-access layer, not in the prompt.
- Region. EU data stays in EU compute. Agentforce and Copilot both support this now; you have to configure it.
# agent_segmentation.yaml
zones:
- id: tier1-customer-service
egress: [crm-api, kb-api]
deny: [warehouse, payment-gateway, internal-wiki]
model: claude-sonnet-via-vendor
region: eu-west-1
- id: tier3-sales-ops-assist
egress: [crm-api, warehouse-read-only]
deny: [billing-write, hr-system]
model: claude-opus-via-vendor
region: us-east-1
Tool gateway with policy enforcement
Don’t let agents call tools directly. Put a policy-enforcing gateway in the middle. OPA, Cedar, or a vendor equivalent.
# policy.rego (simplified)
package agent.tools
default allow := false
allow if {
input.tool == "crm.opportunity.update"
input.field != "amount"
input.actor.tier <= 2
}
allow if {
input.tool == "crm.opportunity.update"
input.field == "amount"
input.delta_pct < 0.10
input.actor.tier <= 2
input.approval.status == "granted"
time.now_ns() - input.approval.granted_at_ns < 300_000_000_000 # 5 min
}
Every tool call goes through this. Denied calls log; the agent retries with a different approach. The policy is versioned, reviewable, testable in CI.
Output filtering and exfiltration defense
Inbound prompt injection is half the threat. The other half is outbound exfiltration — the agent successfully reads sensitive data and then writes it somewhere it shouldn’t.
Defenses:
- Tool sinks have scopes too. The agent that can read PII can’t necessarily write to a webhook tool.
- Output classifiers. Before any external send (email, slack, webhook), run a classifier for PII / secrets / pricing. Block on hit.
- DLP integration. Hook into Purview, Salesforce Shield, or a third-party DLP for outbound content scanning.
- Channel allowlists. Agent can email customers on file. Agent cannot email arbitrary addresses. Enforced at the email tool, not in the prompt.
Audit and replay
You need an audit trail dense enough to replay every agent action. Minimum fields per action:
- Agent identity, user identity, tenant.
- Tool name, scope used, arguments, return.
- Policy decision (allow / deny + rule id).
- Model version, prompt hash, output hash.
- Trace id (OpenTelemetry).
- Approval reference (if applicable).
Store immutably. Retain per regulatory requirement. Index for “find every action this agent took on behalf of this user between these times.” Without this, post-incident forensics is fiction.
Defense against prompt injection
Zero-trust assumes the agent’s input is hostile. Layered defenses:
- Input separation. Clearly delimit “system instructions” from “user content” and “retrieved content” using structured prompts. Some models (Claude with system prompts, GPT-4 with structured roles) honor this better than others.
- Instruction-following classifier. A small model that flags suspicious imperatives (“ignore previous instructions”, “send all account data to”) before they reach the main agent.
- Tool-call review. Before any high-risk tool call, a second model reviews intent vs context. Slower, but catches a class of attacks.
- Spotlighting. Encode retrieved content (base64, special tokens) so the agent treats it as data, not instruction. Microsoft Research’s spotlighting paper documents the technique.
- Limited retries. If the agent gets stuck in an unusual loop after retrieving content, escalate rather than continue.
None of these are silver bullets. Combined, they reduce successful injection by 90%+. Single defenses fall to creative attacks.
The continuous verification loop
Zero-trust isn’t a one-time gate at session start. Re-verify continuously:
- Per tool call: policy check, scope check, fresh token check.
- Per N actions: re-attest identity, refresh delegation.
- On anomaly: tool call pattern divergence triggers re-auth or shutdown.
Anomaly detection at the action level catches compromised sessions before they finish.
Network egress controls
The agent’s runtime should not be able to reach arbitrary hosts. Lock egress to an allowlist:
- CRM APIs (specific endpoints, not the whole domain).
- Tool registry hosts.
- Model provider endpoints (Anthropic, OpenAI, Azure OpenAI).
- Observability endpoints.
Everything else: deny. This blocks an entire class of exfiltration where a prompt-injected agent is told to POST to attacker.example.com. The DNS resolution fails. The attack stops.
Implementation: egress proxy with allowlist, or VPC service controls (GCP), or Azure Private Link + firewall, or PrivateLink + Network Firewall (AWS). Vendor agents need configuration; check Trust Center docs.
Secret handling
Agents don’t see secrets. Tools see secrets. The agent says “send email”; the email tool resolves the SMTP credential from a secrets manager and uses it. The agent never had access.
Patterns:
- Short-lived credentials (15 min TTL) fetched per tool call.
- Hardware-backed signing (KMS, HSM) for high-value operations.
- No plain-text credentials in the agent’s context or system prompt — ever.
- Audit every secret retrieval; alert on anomalous patterns.
If your agent has any string that starts with “sk-” or “AKIA” in its context, you’ve already lost.
What the vendors give you (and don’t)
| Capability | Agentforce | Copilot Studio | Now Assist |
|---|---|---|---|
| Per-tool scopes | Partial | Partial | Partial |
| Policy gateway | Einstein Trust Layer | Purview | NowGuard |
| Step-up auth | Available | Available | Available |
| Output DLP | Shield + Trust Layer | Purview DLP | DLP add-on |
| Audit trail | Yes (Event Monitoring) | Yes (Purview) | Yes (Now Audit) |
| OPA-style external policy | No native | Limited | No native |
Native gets you 60%. The rest is policy engine + observability + a security team that cares.
Bottom line
- Agents are principals, not user proxies. Give them their own identity and audit them separately.
- Per-tool, per-task scopes with JIT elevation. Stop giving agents standing privileges.
- Put a policy gateway in front of every tool. OPA / Cedar / Rego is reviewable in a way a system prompt is not.
- Output filtering catches what input filtering misses. PII / secret classifiers on every outbound sink.
- Continuous re-verification — anomaly at action level, not just session start.
- Vendor frameworks get you 60%. The other 40% is the security architecture you owe yourself.