A team turns on Salesforce-to-HubSpot sync at 4pm Friday with all default mappings and 280,000 records. Monday morning HubSpot has 280,000 contacts, 60,000 of them new duplicates, and a marketing campaign that fires on contact creation just emailed every dormant Salesforce lead from 2019. Data Sync is the right tool when used carefully and a major incident generator when used carelessly. The patterns that prevent the incident are simple and skipped often.
What Data Sync does
Data Sync provides two-way sync between HubSpot and over a hundred external systems, including Salesforce, Microsoft Dynamics, Zoho, Outreach, NetSuite, and Mailchimp. It is part of Operations Hub and replaces fragile one-off integrations with a managed connector that handles auth, mapping, conflicts, and retry.
Capabilities:
Bi-directional or one-way per object
Field-level mapping with transforms
Filter by property to scope sync
Historical sync of existing records
Real-time delta sync after the initial sync
Conflict resolution rules
Health dashboard with error counts
Map fields explicitly, never auto-map
Auto-map sounds convenient and creates the wrong sync within minutes. Fields named the same on both sides often mean different things. Map field by field with intent:
HubSpot.lifecyclestage -> Salesforce.LeadStatus (one-way HS -> SF)
HubSpot.firstname <-> Salesforce.FirstName (bi-directional)
HubSpot.email <-> Salesforce.Email (bi-directional, key)
HubSpot.hs_lead_score <- Salesforce.External_Score (one-way SF -> HS)
HubSpot.deal_amount <- Salesforce.Amount (one-way SF -> HS)
Do not map a field bi-directionally when the system of record is clear and one side should not write. A lead score computed in Salesforce should not be writable from HubSpot, ever.
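One way to keep the mapping honest is to write it down as data and check it before touching the connector. The sketch below is a minimal illustration, not part of Data Sync: `FieldMapping`, `mapping_problems`, and the direction and owner labels are hypothetical names invented here, and the check only asserts that each field's sync direction agrees with its declared owner.

```python
from dataclasses import dataclass

# Direction and owner labels are this sketch's own convention,
# not values taken from Data Sync.
HS_TO_SF, SF_TO_HS, BOTH = "hs_to_sf", "sf_to_hs", "both"

# The only direction each owner allows.
ALLOWED = {"hubspot": HS_TO_SF, "salesforce": SF_TO_HS, "shared": BOTH}


@dataclass(frozen=True)
class FieldMapping:
    hubspot: str
    salesforce: str
    direction: str
    owner: str  # "hubspot", "salesforce", or "shared"


def mapping_problems(mappings):
    """Return one message per mapping whose direction contradicts its owner."""
    problems = []
    for m in mappings:
        expected = ALLOWED[m.owner]
        if m.direction != expected:
            problems.append(
                f"{m.hubspot} <> {m.salesforce}: owner is {m.owner}, "
                f"expected {expected}, got {m.direction}"
            )
    return problems


# A few of the mappings from the list above, with owners made explicit.
PLAN = [
    FieldMapping("lifecyclestage", "LeadStatus", HS_TO_SF, "hubspot"),
    FieldMapping("email", "Email", BOTH, "shared"),
    FieldMapping("hs_lead_score", "External_Score", SF_TO_HS, "salesforce"),
]

if __name__ == "__main__":
    for problem in mapping_problems(PLAN):
        print(problem)
```

An empty result means the written plan is internally consistent. It says nothing about whether the connector is configured the same way, so the plan still has to be compared against the mapping screen by hand.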
Filter scope to prevent surprise
The default historical sync pulls everything. Almost always wrong. Use filters to scope:
Filter contacts where:
Email is known (skip records without email)
Created in last 24 months (skip ancient leads)
Not opted out
Not in suppression list
Run the filtered historical sync in a sandbox or test portal first. Verify that the count matches expectations and spot-check a sample of records before pointing the sync at production.
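To know what count to expect before the sandbox run, the same filter rules can be applied to an export of the source records. This is a rough sketch under stated assumptions: the record keys (`email`, `created_at`, `opted_out`), the `in_scope` helper, and the 730-day approximation of 24 months are all illustrative and not Data Sync filter syntax.

```python
from datetime import datetime, timedelta


def in_scope(record, now, suppressed_emails, max_age_days=730):
    """Apply the four filter rules to one exported record (a plain dict)."""
    email = (record.get("email") or "").strip().lower()
    if not email:
        return False  # email is not known
    if now - record["created_at"] > timedelta(days=max_age_days):
        return False  # older than roughly 24 months
    if record.get("opted_out"):
        return False  # opted out
    if email in suppressed_emails:
        return False  # on the suppression list
    return True


def scoped_count(records, now, suppressed_emails):
    """Count the records that would survive the filters."""
    return sum(1 for r in records if in_scope(r, now, suppressed_emails))


NOW = datetime(2024, 6, 1)
SUPPRESSED = {"blocked@example.com"}
SAMPLE = [
    {"email": "a@example.com", "created_at": datetime(2024, 1, 10), "opted_out": False},
    {"email": None, "created_at": datetime(2024, 2, 1), "opted_out": False},
    {"email": "old@example.com", "created_at": datetime(2019, 3, 1), "opted_out": False},
    {"email": "out@example.com", "created_at": datetime(2024, 3, 1), "opted_out": True},
    {"email": "Blocked@example.com", "created_at": datetime(2024, 4, 1), "opted_out": False},
]

if __name__ == "__main__":
    print(scoped_count(SAMPLE, NOW, SUPPRESSED), "of", len(SAMPLE), "records in scope")
```

If the number from the export and the number the sandbox sync reports differ by more than a handful, the filters in the connector are probably not the filters on paper.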
Conflict resolution
When both systems update the same record, the default is last-write-wins. That is rarely what you want for fields with a clear system of record:
Field: amount
System of record: Salesforce
Direction: SF -> HS
Conflict: SF wins always
Field: phone
System of record: HubSpot (CRM team owns)
Direction: HS -> SF
Conflict: HS wins always
Field: notes
System of record: shared
Direction: bi-directional
Conflict: most recent edit wins
Document the system of record per field in a sheet shared with both teams.
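The per-field rules above can also be expressed as a small lookup, which is handy for reasoning through edge cases with both teams. This is a sketch of the decision logic only: `RULES` and `resolve` are hypothetical names, each side is modeled as a `(value, modified_at)` pair, and the tie-break for equal timestamps is an arbitrary choice made here.

```python
from datetime import datetime

# One rule per field: a fixed winner, or "most_recent" for shared fields.
RULES = {
    "amount": "salesforce",
    "phone": "hubspot",
    "notes": "most_recent",
}


def resolve(field, hubspot, salesforce):
    """Pick the winning value for a field.

    hubspot and salesforce are (value, modified_at) tuples.
    """
    rule = RULES[field]
    if rule == "salesforce":
        return salesforce[0]
    if rule == "hubspot":
        return hubspot[0]
    # Shared field: the later edit wins. On an exact tie this sketch
    # keeps the Salesforce value, which is an arbitrary choice.
    return hubspot[0] if hubspot[1] > salesforce[1] else salesforce[0]


if __name__ == "__main__":
    earlier = datetime(2024, 5, 1, 9, 0)
    later = datetime(2024, 5, 1, 17, 0)
    # HubSpot was edited later, but Salesforce owns amount.
    print(resolve("amount", (1200, later), (1500, earlier)))
    # Shared field: the later HubSpot edit wins.
    print(resolve("notes", ("called back", later), ("left voicemail", earlier)))
```

Writing the rules this way makes one gap visible early: any field missing from the table has no defined winner, which is exactly the case where the default takes over.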
Disable workflows during initial sync
A historical sync that triggers contact-creation workflows on 100k records will email 100k people. Suspend workflows that trigger on creation, run the sync, verify, and re-enable:
Pre-sync checklist:
- Pause "New contact" workflows
- Pause "Lifecycle stage change" workflows
- Confirm filters reduce scope to expected count
- Confirm field mapping in writing
- Run sync
- Spot-check 50 records across segments
- Re-enable workflows after 24-hour quiet period
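The first half of the checklist can be turned into a gate that refuses to proceed while anything is still unsafe. The sketch below assumes the workflow list and the two counts have been gathered by hand or from an export; `sync_blockers`, the trigger labels, and the 5% tolerance are hypothetical choices for illustration, not HubSpot API values.

```python
# Trigger labels are this sketch's own; they stand in for the
# "New contact" and "Lifecycle stage change" workflows above.
RISKY_TRIGGERS = {"contact_created", "lifecycle_stage_changed"}


def sync_blockers(workflows, filtered_count, expected_count, tolerance=0.05):
    """Return the reasons the sync should not start yet (empty list = go)."""
    blockers = []
    for wf in workflows:
        if wf["enabled"] and wf["trigger"] in RISKY_TRIGGERS:
            blockers.append(f"workflow still active: {wf['name']}")
    if expected_count <= 0:
        blockers.append("expected count was never agreed on")
    else:
        drift = abs(filtered_count - expected_count) / expected_count
        if drift > tolerance:
            blockers.append(
                f"filtered count {filtered_count} differs from "
                f"expected {expected_count} by {drift:.0%}"
            )
    return blockers


if __name__ == "__main__":
    workflows = [
        {"name": "Welcome email", "trigger": "contact_created", "enabled": True},
        {"name": "MQL handoff", "trigger": "lifecycle_stage_changed", "enabled": False},
    ]
    for reason in sync_blockers(workflows, filtered_count=41000, expected_count=40000):
        print(reason)
```

The spot-check and the quiet period still need a person. The gate only covers the items that can be stated as a yes or no before the sync starts.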
Health monitoring
The Data Sync health page shows counts, errors, and field-level issues. Check weekly:
Metrics:
Records synced last 24h
Errors last 24h
Failed records last 7d
Field validation errors
Auth status
Failed records do not block other syncs, but they accumulate silently. Subscribe to the daily health email and route it to a triage Slack channel.
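Because failures pile up quietly, it helps to agree on thresholds that turn the metrics into a yes-or-no alert. The following is a minimal sketch that assumes the numbers have been copied into a dict by hand or by some export; the key names, the `health_alerts` function, and both default thresholds are illustrative assumptions, not fields of the Data Sync health page.

```python
def health_alerts(snapshot, max_error_rate=0.01, max_failed_7d=100):
    """Turn one snapshot of the health metrics into a list of alert strings."""
    alerts = []
    synced = snapshot["synced_24h"]
    errors = snapshot["errors_24h"]
    # Errors relative to synced records; any error with nothing synced counts as 100%.
    rate = errors / synced if synced else (1.0 if errors else 0.0)
    if rate > max_error_rate:
        alerts.append(f"error rate {rate:.1%} over last 24h")
    if snapshot["failed_7d"] > max_failed_7d:
        alerts.append(f"{snapshot['failed_7d']} failed records in last 7d")
    if snapshot["validation_errors"] > 0:
        alerts.append(f"{snapshot['validation_errors']} field validation errors")
    if not snapshot["auth_ok"]:
        alerts.append("connector auth needs attention")
    return alerts


if __name__ == "__main__":
    snapshot = {
        "synced_24h": 1000,
        "errors_24h": 50,
        "failed_7d": 40,
        "validation_errors": 0,
        "auth_ok": False,
    }
    for alert in health_alerts(snapshot):
        print(alert)
```

The thresholds are the part worth arguing about. A 1% error rate may be noise for one connector and a serious problem for another, so set them per connector.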
Identity resolution and dedup
Sync uses email as the default identity key for contacts. Records without email match by external ID. Run a dedup pass before sync to avoid creating mirrored duplicates on both sides; fixing duplicates after a sync is significantly harder than fixing them before.
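A pre-sync dedup pass can start as simply as grouping an export by normalized email. The sketch below assumes each record is a dict with `id` and `email` keys and treats case and surrounding whitespace as the only differences worth normalizing; `duplicate_groups` is a hypothetical helper, and real matching rules may need to be stricter or looser.

```python
from collections import defaultdict


def duplicate_groups(records):
    """Group record ids by normalized email and keep groups with more than one id."""
    groups = defaultdict(list)
    for record in records:
        email = (record.get("email") or "").strip().lower()
        if email:  # records without email cannot be matched this way
            groups[email].append(record["id"])
    return {email: ids for email, ids in groups.items() if len(ids) > 1}


if __name__ == "__main__":
    export = [
        {"id": 1, "email": "Ann@Example.com"},
        {"id": 2, "email": " ann@example.com "},
        {"id": 3, "email": "bob@example.com"},
        {"id": 4, "email": None},
    ]
    for email, ids in duplicate_groups(export).items():
        print(email, ids)
```

Running this against exports from both systems separately shows which side carries the duplicates, which in turn decides where the merge work should happen.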
What to do this week
Document system of record per critical field, scope your next sync with filters, pause creation workflows during the initial run, and configure health alerts before flipping the switch on any new connector.