What Bedrock Provides
AWS-managed access to a roster of foundation models: Anthropic Claude (Opus 4.5, Sonnet 4.5, Haiku 4.5), Meta Llama 4 family, Mistral Large 3, Cohere Command R+, Amazon Nova (Pro, Lite, Micro), Stability SD3, and AI21 Jamba. As of 2026, AWS lists 30+ models across providers. Bedrock surfaces them through a uniform InvokeModel / Converse API with shared identity, observability, billing, and guardrails.
Capabilities that matter for CRM:
- VPC integration via VPC endpoints — model traffic never traverses the public internet.
- IAM-based auth — fine-grained per-resource policies.
- Cross-region inference for latency and capacity.
- Bedrock Knowledge Bases (managed RAG over OpenSearch, Aurora pgvector, Pinecone, or S3 Vectors).
- Bedrock Agents for orchestrated tool-using agents with action groups.
- Bedrock Guardrails — content moderation, denied topics, contextual grounding, PII redaction.
- Provisioned Throughput for guaranteed capacity.
- Model Distillation and Custom Model Import for fine-tuned variants.
CRM Integration Patterns
- Salesforce BYO LLM: Einstein Trust Layer can route inference to Bedrock endpoints. Customer data stays inside the AWS account; Salesforce orchestrates the prompt with masking and audit.
- Custom agents inside an AWS-resident architecture: Lambda or ECS calls Bedrock; results write back to Salesforce, ServiceNow, or HubSpot via their respective APIs or MCP servers.
- HubSpot enterprise integration: HubSpot’s Workflow Custom Code (Node.js) can invoke Bedrock through AWS SDK; useful when ZDR or AWS contractual posture is required.
- ServiceNow Now Assist BYO LLM: configurable to route to Bedrock for organizations on AWS-aligned procurement.
- Knowledge ingestion: Bedrock Knowledge Bases ingest from Salesforce Knowledge, ServiceNow KB, Zendesk Help Center via connectors and S3 sync; serve as the grounding layer for any of the above.
When Bedrock Wins
- Existing AWS commitments (EDP, Reserved Instance balances, savings plans) — Bedrock spend counts toward commit.
- Regulated workloads where AWS data-residency, HIPAA-eligible configuration, GovCloud (FedRAMP High), and signed BAA matter.
- Architectures already running in AWS — same region, low latency, no cross-cloud egress fees.
- Multi-model strategies — switch models behind a uniform API without rewiring auth and observability.
- Sovereign cloud or air-gapped deployments via AWS Outposts and Local Zones.
- Procurement that prefers a single hyperscaler relationship over a stable of model-vendor contracts.
When Direct API Wins
- Cross-cloud or on-prem architectures where AWS isn’t the obvious home.
- Need for bleeding-edge model access — frontier vendors (OpenAI, Anthropic, Google, xAI) often ship new versions on their own APIs first; Bedrock typically follows by days to weeks.
- Simpler pricing and onboarding for low-volume evaluations and prototypes.
- Provider-specific features not exposed through Bedrock (e.g., specific Anthropic beta features, certain caching modes, very-long context flavors).
Cost Considerations
Bedrock pricing matches or near-matches direct vendor API list price for most models, with some variance. Provisioned Throughput offers volume economics but locks capacity for terms. Pay attention to:
- Data transfer charges within and across regions.
- Knowledge Base storage and OpenSearch costs (often the largest line item in a RAG-heavy deployment).
- Guardrails per-policy unit pricing.
- Custom Model Import storage fees.
Total cost is usually within 10–15% of direct API for equivalent workloads — the procurement and architectural reasons drive choice more than per-token price.
Common Failure Modes
- Picking Bedrock for “AWS discount” without modeling actual unit cost — sometimes direct API plus a separate VPC peering is cheaper.
- Not configuring Bedrock Guardrails — defaulting to “no safety net” while the model vendor’s own guardrails sit unused.
- Cross-region without need — regional inference endpoints are usually faster and cheaper.
- Knowledge Bases without document hygiene — indexing 50K stale knowledge articles produces a stale agent.
What to Do This Week
If you’re evaluating Bedrock vs direct API, build the same agent twice on the same model. Compare latency, cost per interaction, and operational toil over 30 days. The right answer is usually obvious by week three.