The Architecture
The warehouse holds the source of truth. The CDP or Data 360 federates and resolves identity. The CRM surfaces activations. Analytics happen in the warehouse. Reverse-ETL tools push curated records back to operational systems. One data layer, many consumers, one set of definitions. The 2026 reference architecture has stabilized around this shape because the alternative — every operational tool maintaining its own customer record — produced the data divergence problems that cost so many enterprises so much trust during 2022-2024. Salesforce Data Cloud’s zero-copy federation with Snowflake (GA 2025) and Databricks Iceberg (GA 2026) is the most visible vendor reaction to this pattern.
Vendor Choices
Snowflake leads on ease of use, ecosystem breadth, and SQL performance for the analytic workloads most CRM teams run; pricing is per-credit consumption with reserved-capacity discounts for committed volume. Databricks leads on unified data + ML, Spark for heavy ETL, and the open-table-format direction (Delta, Iceberg) that the industry is converging on; the Databricks IPO chatter through 2025 made enterprise architects revisit lock-in. BigQuery offers serverless economics and the deepest Google Cloud integration; it dominates in shops already on Workspace and Vertex AI. Redshift remains relevant for AWS-anchored stacks though it has lost share to Snowflake on AWS. Consolidation around one of these is the modern pattern; running two creates a re-platforming bill nobody wants.
Modeling
Medallion architecture, with bronze (raw), silver (cleaned, conformed), and gold (business-ready) layers, is the default. Customer dimensional models in gold (dim_customer, dim_account, fct_orders, fct_engagements) become the activation source. Invest in modeling before activation; bad models make every downstream job painful and produce the metric-disagreement problems that destroy trust in dashboards. dbt Cloud is the dominant transformation framework as of 2026 with SQLMesh as the credible alternative; both support contracts, tests, and lineage. What separates good teams from poor ones is discipline: a single owner per gold-layer table, a published SLA, and a test suite that runs on every commit.
Warehouse layout:
bronze.salesforce.account_raw (replicated as-is)
bronze.product.event_raw
silver.customer.account_clean (deduped, typed)
silver.customer.event_normalized
gold.dim_customer (one row per resolved customer)
gold.fct_pipeline (forecast-ready)
gold.fct_engagement (cross-channel, 13-month rolling)
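The ownership discipline described above (one owner, a published SLA, and tests per gold table) can be checked mechanically. The following is a minimal sketch, assuming a hand-maintained registry; the GoldTable class, the owner names, the SLA values, and the test names are all hypothetical and not part of dbt or SQLMesh.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GoldTable:
    """Hypothetical registry entry for one gold-layer table."""
    name: str
    owner: Optional[str] = None        # the single accountable owner
    sla_hours: Optional[int] = None    # published freshness SLA, in hours
    tests: List[str] = field(default_factory=list)


def governance_gaps(table: GoldTable) -> List[str]:
    """Return the governance items this table is missing."""
    gaps = []
    if not table.owner:
        gaps.append("no owner")
    if table.sla_hours is None:
        gaps.append("no published SLA")
    if not table.tests:
        gaps.append("no tests")
    return gaps


# Illustrative registry; every owner, SLA, and test name is made up.
REGISTRY = [
    GoldTable("gold.dim_customer", owner="crm-data", sla_hours=24,
              tests=["unique_customer_id", "not_null_customer_id"]),
    GoldTable("gold.fct_pipeline", owner="revops", sla_hours=6),
    GoldTable("gold.fct_engagement"),
]

report = {t.name: governance_gaps(t) for t in REGISTRY}
for name, gaps in report.items():
    print(name, "OK" if not gaps else ", ".join(gaps))
```

A check like this could run in CI next to the transformation tests, so that a gold table without an owner fails the build rather than surfacing later as a disputed metric.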
CDP Integration
The warehouse feeds the CDP or Data 360. The CDP resolves identity across systems (web visitor, mobile user, email subscriber, and call-center caller all become one), activates segments, and powers real-time personalization. Avoid duplicating warehouse logic in the CDP: keep the CDP focused on identity resolution and activation, and keep transformation in the warehouse. The teams that violated this principle in 2024 ended up with two competing customer 360s; reconciling them was the painful work of 2025.
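To make the identity-resolution step concrete, here is a minimal sketch of deterministic matching with a union-find structure: records that share any identifier value are merged into one profile. This is an illustration only, not how any particular CDP implements it; production systems typically add fuzzy and probabilistic rules. The record shape and the identifier names (email, phone) are assumptions.

```python
from collections import defaultdict


def resolve_identities(records):
    """Group record ids that share any identifier value (email, phone, ...)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (identifier type, value) -> first record id carrying it
    for rec in records:
        find(rec["id"])  # register the record even if it matches nothing
        for key, value in rec.items():
            if key == "id" or value is None:
                continue
            token = (key, value)
            if token in first_seen:
                union(first_seen[token], rec["id"])
            else:
                first_seen[token] = rec["id"]

    clusters = defaultdict(set)
    for rec in records:
        clusters[find(rec["id"])].add(rec["id"])
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical source records from four channels.
records = [
    {"id": "web-1", "email": "ana@example.com", "phone": None},
    {"id": "mobile-7", "email": "ana@example.com", "phone": "+15550100"},
    {"id": "call-3", "email": None, "phone": "+15550100"},
    {"id": "email-9", "email": "bo@example.com", "phone": None},
]
profiles = resolve_identities(records)
print(profiles)
```

Here the web visitor and the call-center caller share no identifier directly; they end up in the same profile only through the mobile record, which is the transitive behavior the union-find structure provides.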
What Changed in 2026
Three shifts: zero-copy federation matured (Snowflake-Salesforce, Databricks-Salesforce, BigQuery to multiple destinations), Iceberg became the de facto open table format (Databricks acquired Tabular in 2024 to consolidate the standard), and AI agent grounding emerged as a major warehouse workload — a single Agentforce conversation might issue 8-15 grounded queries against the warehouse, multiplying compute cost compared to traditional analytics.
Cost Considerations
Snowflake credits meter compute, with storage billed separately; AI grounding can raise credit consumption 30-100% over baseline analytics. Databricks DBUs vary by cluster type; the Photon engine matters for SQL workloads. Choosing between BigQuery capacity-based (Editions) and on-demand pricing gets harder as agent queries become unpredictable. Budget AI grounding cost as a separate line; instrument query attribution so a single agent flow does not anonymously consume the budget.
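Query attribution can be sketched as a simple roll-up over a query log, assuming each query carries a tag that names the flow that issued it (Snowflake, for example, exposes a QUERY_TAG session parameter). The log rows, the tag naming scheme, and the credit figures below are hypothetical.

```python
from collections import defaultdict


def credits_by_flow(query_log):
    """Sum credits per query tag; untagged queries are grouped together."""
    totals = defaultdict(float)
    for row in query_log:
        tag = row.get("query_tag") or "untagged"
        totals[tag] += row["credits"]
    return dict(totals)


def grounding_share(totals, agent_prefix="agent:"):
    """Fraction of total credits consumed by tags with the agent prefix."""
    total = sum(totals.values())
    agent = sum(v for k, v in totals.items() if k.startswith(agent_prefix))
    return agent / total if total else 0.0


# Hypothetical rows exported from a warehouse query history.
query_log = [
    {"query_tag": "agent:case-summary", "credits": 0.5},
    {"query_tag": "agent:case-summary", "credits": 0.25},
    {"query_tag": "bi:pipeline-dashboard", "credits": 1.0},
    {"query_tag": None, "credits": 0.25},
]
totals = credits_by_flow(query_log)
share = grounding_share(totals)
print(totals, share)
```

The "untagged" bucket is worth watching on its own: if it grows after an agent rollout, some flow is consuming budget without attribution.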
Common Failure Modes
The recurring failures: three warehouses kept for “different reasons” that all hold customer data, gold tables without owners, no contracts between the bronze and silver layers, and a CDP duplicating warehouse logic. The most expensive: discovering that AI grounding is the largest line on the warehouse bill the month after rollout.
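As an illustration of what a bronze-to-silver contract checks, the sketch below validates one row against an expected set of columns and types. It is a simplified stand-in for the contract features in transformation frameworks, not their API; the column names and types are assumptions.

```python
# Hypothetical contract for silver.customer.account_clean: column -> type.
CONTRACT = {"account_id": str, "name": str, "annual_revenue": float}


def contract_violations(row, contract):
    """List every way a row breaks the contract (missing or mistyped columns)."""
    problems = []
    for column, expected in contract.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected):
            actual = type(row[column]).__name__
            problems.append(f"{column}: expected {expected.__name__}, got {actual}")
    return problems


# Illustrative rows; the values are made up.
good = {"account_id": "001A", "name": "Acme", "annual_revenue": 1200000.0}
bad = {"account_id": 17, "name": "Acme"}

print(contract_violations(good, CONTRACT))
print(contract_violations(bad, CONTRACT))
```

Failing the load when violations appear keeps an upstream schema change from silently reshaping the silver layer.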
What to Do This Week
Walk one CRM-driven decision back to its source data through the warehouse. Note every transformation, every join, every owner change. The walk will surface the brittlest dependency — that is your next sprint.
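If lineage metadata is available, the walk can be partly automated. The sketch below does a breadth-first traversal from one gold table back to its raw sources and flags tables with no owner. The lineage map and the owner names are hypothetical; in practice they would come from a transformation tool's lineage export.

```python
from collections import deque

# Hypothetical lineage: table -> (upstream tables, owner or None).
LINEAGE = {
    "gold.dim_customer": (
        ["silver.customer.account_clean", "silver.customer.event_normalized"],
        "crm-data",
    ),
    "silver.customer.account_clean": (["bronze.salesforce.account_raw"], "data-platform"),
    "silver.customer.event_normalized": (["bronze.product.event_raw"], "data-platform"),
    "bronze.salesforce.account_raw": ([], None),
    "bronze.product.event_raw": ([], "product-eng"),
}


def walk_upstream(table, lineage):
    """Breadth-first walk; returns (table, owner, depth) for each dependency."""
    seen = {table}
    queue = deque([(table, 0)])
    path = []
    while queue:
        name, depth = queue.popleft()
        upstream, owner = lineage[name]
        path.append((name, owner, depth))
        for parent in upstream:
            if parent not in seen:  # visit shared ancestors only once
                seen.add(parent)
                queue.append((parent, depth + 1))
    return path


steps = walk_upstream("gold.dim_customer", LINEAGE)
unowned = [name for name, owner, _ in steps if owner is None]
for name, owner, depth in steps:
    print("  " * depth + name, "(owner: %s)" % (owner or "NONE"))
```

Unowned tables and points where the owner changes between adjacent layers are reasonable first candidates for the brittlest dependency.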