
A new VP of Marketing pulls a list of “active customers” and the count is 12 percent higher than finance reported. The discrepancy traces to duplicates, lifecycle stage drift, and contacts marked active that have not opened an email in two years. None of it is malicious; all of it is a data quality problem nobody owned. HubSpot makes maintenance easier than it used to be, but the cleanup discipline still has to come from a human.

Duplicates and the merge cadence

HubSpot surfaces likely duplicates in Settings > Data quality > Duplicate management. Property matches on email and phone catch most cases. Review at least weekly. Ops Hub adds automated dedup rules for high-confidence matches:

Rule: Merge contacts where
  email_lowercase matches AND
  created_within 30 days
Action: Auto-merge, keep oldest record

Resist the temptation to auto-merge on phone alone: partner reps often share phone numbers, and you will collapse distinct people.
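
To make the merge rule concrete, here is a minimal sketch of the same logic run against an exported contact list. It is not HubSpot's own matching algorithm. The field names (id, email, phone, createdAt) are assumptions for illustration, and "created within 30 days" is read here as "all records in the group were created within 30 days of each other." Phone is deliberately ignored as a match key.

```javascript
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// Group contacts by normalized email and propose a merge plan per group.
// Field names are illustrative, not HubSpot's schema.
function findDuplicateCandidates(contacts) {
  const byEmail = new Map();
  for (const c of contacts) {
    const key = (c.email || "").trim().toLowerCase();
    if (!key) continue; // no email, nothing safe to match on
    if (!byEmail.has(key)) byEmail.set(key, []);
    byEmail.get(key).push(c);
  }

  const groups = [];
  for (const [email, members] of byEmail) {
    if (members.length < 2) continue;
    // Oldest record first: it is the one the rule keeps.
    const sorted = [...members].sort(
      (a, b) => new Date(a.createdAt) - new Date(b.createdAt)
    );
    const spanMs =
      new Date(sorted[sorted.length - 1].createdAt) - new Date(sorted[0].createdAt);
    groups.push({
      email,
      keepId: sorted[0].id,
      mergeIds: sorted.slice(1).map((m) => m.id),
      // Outside the 30-day window, leave the group for human review.
      autoMerge: spanMs <= THIRTY_DAYS_MS,
    });
  }
  return groups;
}
```

Groups with autoMerge set to false are the ones to route to the weekly manual review.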

Property standardization

Inconsistent state names (“CA,” “California,” “Calif.”) break grouped reports and routing rules. Phone formats (“+1 415-555-0100” vs “415.555.0100”) break dialer integrations. Company name variants (“Acme Inc,” “Acme, Inc.,” “ACME”) prevent rollups. Ops Hub format automations standardize these on write:

Workflow: Standardize on contact create or update
  - Lowercase email
  - Format phone E.164
  - Title-case first/last name
  - Look up state from postal code
  - Trim whitespace on company name
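
If you need the same rules outside HubSpot, for example in an import script, the sketch below shows one way to write them. It makes several assumptions: phone numbers without a leading "+" are treated as US/Canada numbers, the state step uses a small hypothetical alias table rather than a postal code lookup (which needs a reference dataset), and the contact field names are illustrative.

```javascript
// Hypothetical alias table; extend it with the variants found in your own data.
const STATE_ALIASES = new Map([
  ["ca", "CA"], ["california", "CA"], ["calif", "CA"],
  ["ny", "NY"], ["new york", "NY"],
]);

// Returns an E.164 string, or null when the input cannot be normalized safely.
function toE164(phone, defaultCountryCode = "1") {
  const digits = phone.replace(/\D/g, "");
  if (phone.trim().startsWith("+")) {
    return digits.length >= 8 && digits.length <= 15 ? "+" + digits : null;
  }
  if (digits.length === 10) return "+" + defaultCountryCode + digits;
  if (digits.length === 11 && digits.startsWith(defaultCountryCode)) return "+" + digits;
  return null;
}

// Naive title case: fine for "o'brien", wrong for names like "McDonald".
function titleCase(name) {
  return name
    .trim()
    .toLowerCase()
    .replace(/(^|[\s'-])(\p{L})/gu, (_, sep, ch) => sep + ch.toUpperCase());
}

function normalizeState(value) {
  const key = value.trim().toLowerCase().replace(/\./g, "");
  return STATE_ALIASES.get(key) ?? null;
}

// Apply fn only to string values; leave missing fields untouched.
const apply = (value, fn) => (typeof value === "string" ? fn(value) : value);

function standardizeContact(c) {
  return {
    ...c,
    email: apply(c.email, (v) => v.trim().toLowerCase()),
    phone: apply(c.phone, (v) => toE164(v) ?? v), // keep original if unparseable
    firstName: apply(c.firstName, titleCase),
    lastName: apply(c.lastName, titleCase),
    state: apply(c.state, (v) => normalizeState(v) ?? v.trim()),
    company: apply(c.company, (v) => v.trim().replace(/\s+/g, " ")),
  };
}
```

Values that cannot be normalized are kept as entered rather than blanked, so a reviewer can still see what was typed.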

Dead contact suppression

Contacts that have not opened or clicked in 18 months cost you deliverability when you keep emailing them. Tag them with a suppression flag, exclude from active campaigns, and set a hard-delete cadence aligned with your retention policy and any GDPR/CCPA obligations:

Active list: ENG_dead_18mo
  Filter: Last engagement > 18 months ago
  Filter: Not opted out (already excluded)

Workflow: Tag suppression
  Trigger: Member of ENG_dead_18mo
  Action: Set marketing_suppression = true

Quarterly job: Hard-delete contacts where
  marketing_suppression = true AND
  has_open_deal = false AND
  retention_window_passed = true
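
The two decisions above, "is this contact dead?" and "may we hard-delete it?", can be sketched as plain predicates. This is an illustration under assumptions, not HubSpot configuration: the field names (lastEngagementAt, createdAt, optedOut, marketingSuppression, hasOpenDeal, suppressedAt) are hypothetical, and the 24-month retention default is a placeholder for whatever your retention policy specifies.

```javascript
// Return a new Date that is `months` calendar months before `date` (UTC).
function subtractMonths(date, months) {
  const d = new Date(date.getTime());
  d.setUTCMonth(d.getUTCMonth() - months);
  return d;
}

// Dead = no engagement inside the window. Opted-out contacts are skipped
// because they are already excluded from sends.
function isDeadContact(contact, now = new Date(), months = 18) {
  if (contact.optedOut) return false;
  // Contacts that never engaged fall back to their creation date.
  const last = contact.lastEngagementAt ?? contact.createdAt;
  return new Date(last) < subtractMonths(now, months);
}

// Hard-delete only when all three conditions of the quarterly job hold.
// A missing suppressedAt yields an invalid date, so the check fails safe.
function canHardDelete(contact, now = new Date(), retentionMonths = 24) {
  if (!contact.marketingSuppression || contact.hasOpenDeal) return false;
  return new Date(contact.suppressedAt) < subtractMonths(now, retentionMonths);
}
```

Keeping the open-deal check inside the delete predicate means a suppressed contact tied to live revenue is never removed by the quarterly job.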

Email validation upstream and downstream

Catch bad addresses at form submit with real-time validation. HubSpot forms offer a built-in email validation setting; integrate a dedicated validator such as Kickbox for higher accuracy. For existing data, run quarterly sweeps to flag invalid addresses and remove them from sends:

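// Sweep with validator API. validator.bulk stands in for your vendor's client
// and is assumed to return one result per email, in the order sent.
const chunk = (arr, size) =>
  Array.from({ length: Math.ceil(arr.length / size) }, (_, i) =>
    arr.slice(i * size, (i + 1) * size));

// HubSpot CRM batch endpoints accept at most 100 inputs per call.
for (const batch of chunk(contacts, 100)) {
  const results = await validator.bulk(batch.map(c => c.email));
  // Map results back to contacts by position; the validator only saw emails.
  const inputs = batch
    .filter((_, i) => results[i]?.status === "undeliverable")
    .map(c => ({ id: c.id, properties: { email_status: "undeliverable" } }));
  if (inputs.length === 0) continue; // nothing to flag in this batch
  await hsClient.crm.contacts.batchApi.update({ inputs });
}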

Required field enforcement

A contact missing lifecycle stage breaks lifecycle reporting. Use validation on critical properties at create time and a workflow that flags violations for human review rather than silently dropping records.
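
As a rough illustration of "flag, do not drop," the sketch below checks each contact for required properties and routes failures to a review queue. The required list is an assumption to adapt; lifecyclestage and hs_analytics_source are the internal names HubSpot uses for lifecycle stage and original source, and the { id, properties } record shape mirrors how the CRM API returns contacts.

```javascript
// Assumed set of critical properties; adjust to your own reporting needs.
const REQUIRED_CONTACT_PROPS = ["email", "lifecyclestage", "hs_analytics_source"];

// Return the names of required properties that are absent or blank.
function missingProps(properties, required = REQUIRED_CONTACT_PROPS) {
  return required.filter((name) => {
    const value = properties[name];
    return value === undefined || value === null || String(value).trim() === "";
  });
}

// Split contacts into clean records and a review queue.
// Nothing is discarded: every input lands in one of the two lists.
function triage(contacts, required) {
  const ok = [];
  const needsReview = [];
  for (const c of contacts) {
    const missing = missingProps(c.properties, required);
    if (missing.length === 0) ok.push(c);
    else needsReview.push({ id: c.id, missing });
  }
  return { ok, needsReview };
}
```

The needsReview entries name exactly which properties are missing, which is what a reviewer needs to fix the record quickly.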

Data quality dashboard

What is not measured does not improve. Build one dashboard with:

- Duplicate candidates open
- Contacts missing lifecycle stage
- Contacts missing original source
- Companies with no associated contacts
- Deals with no associated contacts
- Contacts with malformed phone
- Marketing-eligible contacts without consent record

Publish weekly. Assign owners per metric. Improvement appears within a quarter when ownership is clear.
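
If you want the weekly numbers outside a HubSpot report, a few of the contact metrics can be computed from an export with a table of named checks. The sketch below covers four of them. The field names marketingEligible and consentRecordedAt, the owner values, and the E.164 pattern used for "malformed phone" are all assumptions for illustration.

```javascript
// Digits-only E.164 shape: "+" then 8 to 15 digits, no leading zero.
const E164_PATTERN = /^\+[1-9]\d{7,14}$/;

// One entry per metric: a label, a named owner, and a predicate.
const CONTACT_METRICS = [
  { name: "Contacts missing lifecycle stage", owner: "unassigned",
    test: (c) => !c.lifecyclestage },
  { name: "Contacts missing original source", owner: "unassigned",
    test: (c) => !c.hs_analytics_source },
  { name: "Contacts with malformed phone", owner: "unassigned",
    test: (c) => Boolean(c.phone) && !E164_PATTERN.test(c.phone) },
  { name: "Marketing-eligible contacts without consent record", owner: "unassigned",
    test: (c) => Boolean(c.marketingEligible) && !c.consentRecordedAt },
];

// Count how many contacts fail each check.
function weeklySnapshot(contacts, metrics = CONTACT_METRICS) {
  return metrics.map(({ name, owner, test }) => ({
    name,
    owner,
    count: contacts.filter(test).length,
  }));
}
```

Storing the owner next to the check keeps the "one named owner per metric" rule visible every time the snapshot is published.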

What to do this week

Run the duplicate manager, build the data quality dashboard above, and assign one named owner per metric before the end of the week.
