The CMDB is supposed to answer “what depends on what”. After two years of discovery, manual edits, and three integrations all writing relationships their own way, it answers “kind of, sometimes, ask a human”. Cleaning the asset relationship graph in Freshservice is unglamorous, unprioritised, and the single change that buys the most operational leverage per hour invested.
The four diseases of a CMDB graph
Almost every neglected graph has the same four problems:
- Stale edges — relationships to assets that no longer exist, or that have not been seen by discovery in 180+ days.
- Phantom nodes — assets that exist in Freshservice with no recent discovery signal, no owner, no last login, nothing.
- Circular dependencies — A depends on B depends on C depends on A. Often a discovery quirk.
- Sparse coverage — production services with no upstream dependencies at all. Either they truly have none (rare) or nobody bothered to record them (common).
The cleanup protocol attacks them in this order because each step uses information the previous step revealed.
Step one: stale-edge sweep
Pull every relationship and join against last-seen timestamps from your discovery probe. Any edge where either endpoint has not been seen by discovery in 180 days is suspect.
# Returns one page only; repeat with &page=2, &page=3, ... until the list comes back empty.
curl -s -u "$FS_KEY:X" \
  "https://acme.freshservice.com/api/v2/cmdb/relationships?per_page=100" \
  | jq '[.relationships[]
         | {id, from: .primary_id, to: .secondary_id, type: .relationship_type_id}]'
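The request above fetches a single page of at most 100 relationships. A minimal sketch of collecting every page follows. It assumes the list ends when a page comes back shorter than the page size, and fetchPage is a hypothetical function you supply that wraps the request and returns that page's relationships array.

```javascript
// Collect every page of a paginated list.
// fetchPage(page, perPage) is a hypothetical caller-supplied function that
// performs the HTTP request and resolves to that page's array of records.
async function fetchAllPages(fetchPage, perPage = 100, maxPages = 1000) {
  const all = [];
  for (let page = 1; page <= maxPages; page++) {
    const batch = await fetchPage(page, perPage);
    all.push(...batch);
    // Assumption: a short (or empty) page marks the end of the list.
    if (batch.length < perPage) break;
  }
  return all;
}
```

The maxPages cap is only a safety net so a misbehaving endpoint cannot loop forever.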
Cross-reference each endpoint with the asset’s last_audit_date. If both endpoints are stale (dual-stale), the relationship is dead: delete it outright. If only one endpoint is stale (single-stale), a human needs to decide.
A useful intermediate state: tag stale relationships with a _review flag rather than deleting immediately. Change Advisory Board members can scan the review list before the next CAB meeting and confirm. See freshservice-change-advisory-board for how to integrate this into existing CAB cadence.
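The cross-reference can be sketched as a small classifier. This is an illustration, not a definitive implementation: it assumes relationships shaped like the jq output above ({id, from, to, type}) and a lastSeen map from asset id to last_audit_date that you have built separately. The function names are hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// lastSeen: Map of asset id -> ISO timestamp of the last discovery sighting.
// An id with no usable timestamp is treated as stale (asset gone or never audited).
function isStale(lastSeen, id, now, maxAgeDays) {
  const ts = Date.parse(lastSeen.get(id));
  if (Number.isNaN(ts)) return true;
  return (now - ts) / DAY_MS > maxAgeDays;
}

// Split relationships into fresh, single-stale (human review) and
// dual-stale (safe to delete or flag) buckets.
function classifyEdges(relationships, lastSeen, now = new Date(), maxAgeDays = 180) {
  const buckets = { fresh: [], singleStale: [], dualStale: [] };
  for (const rel of relationships) {
    const staleCount =
      Number(isStale(lastSeen, rel.from, now, maxAgeDays)) +
      Number(isStale(lastSeen, rel.to, now, maxAgeDays));
    const bucket = ['fresh', 'singleStale', 'dualStale'][staleCount];
    buckets[bucket].push(rel);
  }
  return buckets;
}
```

The singleStale bucket is the natural input for the review list; dualStale can go straight to deletion or be flagged first if you prefer a confirmation step.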
Step two: phantom node audit
A phantom node is an asset with no useful telemetry. Detection:
- No discovery event in 180 days.
- No owner assigned.
- No ticket activity (impact, attachment, watcher) in 365 days.
- No active relationships.
Anything matching all four is dead inventory. Archive it. Do not delete in the first pass: set a custom field such as archived_for_cleanup_2026_05 so you can resurrect the asset if a forgotten dependency surfaces. After 90 days with no resurrection, hard delete.
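// Days since an ISO timestamp; a missing or unparseable value counts as "never".
function daysSince(timestamp) {
  const ms = Date.parse(timestamp);
  return Number.isNaN(ms) ? Infinity : (Date.now() - ms) / 86_400_000;
}

// fetchAllAssets() is not defined here; it is assumed to return assets that
// already carry last_ticket_at and relationship_count.
async function findPhantoms() {
  const assets = await fetchAllAssets();
  return assets.filter(a => {
    const lastDiscovery = daysSince(a.last_audit_date);
    const lastTicket = daysSince(a.last_ticket_at);
    return lastDiscovery > 180          // no discovery event in 180 days
      && !a.user_id                     // no owner assigned
      && lastTicket > 365               // no ticket activity in 365 days
      && a.relationship_count === 0;    // no active relationships
  });
}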
Run this with a dry-run flag first. Sample 50 results and check each one by hand. If more than two of the 50 turn out to be live assets, recalibrate the thresholds: the CMDB is sicker than you thought, and you need to lengthen the windows.
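A rough sketch of the sampling step, assuming the phantom candidates are already in an array. The helper names and the { id, isReal } review shape are illustrative, not part of any Freshservice API.

```javascript
// Draw up to `size` items uniformly at random without replacement
// (partial Fisher-Yates shuffle on a copy, so the input is not mutated).
function sample(items, size = 50, random = Math.random) {
  const pool = [...items];
  const n = Math.min(size, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// reviewed: array of { id, isReal } filled in by whoever checked the sample.
// More than `maxReal` live assets in the sample means the thresholds are too loose.
function needsRecalibration(reviewed, maxReal = 2) {
  return reviewed.filter(r => r.isReal).length > maxReal;
}
```

Sampling at random rather than taking the first 50 matters here, because list order often correlates with asset age or type.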
Step three: circular dependency detection
Cycles in the graph break impact analysis. If A depends on B depends on A, “what is impacted if A goes down” returns infinite recursion or, more commonly, an arbitrary truncation that hides real impact.
Walk the graph with a depth-first search and flag any node visited twice in a single path:
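// adjacency: { nodeId: [nodeId, ...] }; neighbour ids must be strings, like the keys.
function detectCycles(adjacency, maxDepth = 10) {
  const cycles = new Map(); // canonical key -> cycle, so each loop is reported once
  for (const start of Object.keys(adjacency)) {
    // Iterative depth-first search; every stack entry carries its own path.
    const stack = [[start, [start]]];
    while (stack.length) {
      const [node, path] = stack.pop();
      for (const next of adjacency[node] || []) {
        const seenAt = path.indexOf(next);
        if (seenAt !== -1) {
          // Keep only the loop itself, not the path that led into it.
          const loop = path.slice(seenAt);
          // Rotate so the smallest id comes first; all rotations share one key.
          const min = loop.indexOf([...loop].sort()[0]);
          const canon = [...loop.slice(min), ...loop.slice(0, min)];
          cycles.set(canon.join('>'), [...canon, canon[0]]);
          continue;
        }
        // Depth cap keeps the walk bounded; loops longer than maxDepth are missed.
        if (path.length < maxDepth) stack.push([next, [...path, next]]);
      }
    }
  }
  return [...cycles.values()];
}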
Most cycles are introduced by discovery rules that record both hosts and hosted-on between the same pair, or by humans clicking the inverse relationship without realising it was already there. Resolution rule: keep the directional edge that matches semantic reality (a VM is hosted on a hypervisor, not the other way), delete the inverse.
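The resolution rule can be sketched as a filter that finds inverse edges whose canonical twin already exists. The type names hosts and hosted_on come from the paragraph above; the INVERSE_OF table and the {id, from, to, type} edge shape with type names instead of type ids are assumptions for illustration.

```javascript
// Hypothetical mapping: inverse relationship type -> the canonical type to keep.
const INVERSE_OF = new Map([['hosts', 'hosted_on']]);

// Return the ids of edges that merely restate a canonical edge in reverse,
// e.g. "hypervisor hosts vm" when "vm hosted_on hypervisor" already exists.
function findRedundantInverses(edges, inverseOf = INVERSE_OF) {
  const present = new Set(edges.map(e => `${e.from}|${e.type}|${e.to}`));
  return edges
    .filter(e => {
      const canonical = inverseOf.get(e.type);
      // Only flag the inverse when its canonical twin is actually recorded;
      // otherwise deleting it would lose the dependency entirely.
      return canonical !== undefined && present.has(`${e.to}|${canonical}|${e.from}`);
    })
    .map(e => e.id);
}
```

An inverse edge with no canonical twin is left alone on purpose: it should be converted to the canonical direction, not deleted.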
Step four: sparse coverage backfill
The hardest of the four. Production services with zero recorded dependencies are usually under-documented, not actually standalone. The fix is not technical; it is conversational.
Identify candidates:
- Services tagged production or in the business_critical category.
- Zero outgoing relationships, or only depends_on_self style noise.
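The two criteria above could be sketched as a filter over exported services and edges. The field names (tags, category), the edge shape, and the NOISE_TYPES list are illustrative assumptions; adapt them to however your export represents them.

```javascript
// services: [{ id, tags: [...], category }]; edges: [{ from, to, type }].
// Field names and the NOISE_TYPES list are illustrative assumptions.
const NOISE_TYPES = new Set(['depends_on_self']);

function findSparseServices(services, edges) {
  // Count outgoing edges per service, ignoring self-loops and noise types.
  const outgoing = new Map();
  for (const e of edges) {
    if (e.from === e.to || NOISE_TYPES.has(e.type)) continue;
    outgoing.set(e.from, (outgoing.get(e.from) || 0) + 1);
  }
  return services.filter(s => {
    const inScope =
      (s.tags || []).includes('production') || s.category === 'business_critical';
    return inScope && !outgoing.has(s.id);
  });
}
```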
Send each owner a five-question form: what does this service need to function, what does it expose, what runs on the same hardware, what monitors it, what backs it up. Their answers map directly to relationship types.
This step is the slowest. Budget two weeks of part-time owner outreach for a 500-service catalogue. Pair the outreach with a service-catalogue review — the same conversation tends to surface stale service entries too.
See freshservice-cmdb-relationships for the canonical relationship-type vocabulary so the backfill matches existing taxonomy.
Discovery probe hygiene
Most graph rot starts at the probe. If the probe is misconfigured — running with stale credentials, missing a subnet, classifying by hostname pattern — every cleanup pass is undone in a week.
Three things to fix on the probe before doing any graph work:
- Probe credentials rotated and validated against every target class (Windows, Linux, network device, hypervisor).
- Subnet coverage matches network team’s authoritative list.
- Classification rules recently reviewed — class drift over years creates duplicate asset categories that fragment the graph.
See freshservice-probe-agent-setup for the probe-side configuration that makes the cleanup stick.
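The subnet check in particular is easy to automate. A minimal sketch, assuming both lists are available as arrays of CIDR strings; it compares the text only, so it will not notice a probe range that covers an authoritative subnet under a different prefix length.

```javascript
// Compare the subnets the probe scans with the network team's list.
// Plain string comparison on normalised CIDR text: it does not detect a
// probe range that covers an authoritative subnet under a different prefix.
function subnetCoverageGaps(probeSubnets, authoritativeSubnets) {
  const norm = s => s.trim().toLowerCase();
  const probe = new Set(probeSubnets.map(norm));
  const auth = new Set(authoritativeSubnets.map(norm));
  return {
    missing: [...auth].filter(s => !probe.has(s)), // on the list, never scanned
    extra: [...probe].filter(s => !auth.has(s)),   // scanned, not on the list
  };
}
```

Anything under missing is a blind spot for discovery; anything under extra is either a retired range or a gap in the network team's list.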
Cleanup cadence
Once the initial scrub is done, the maintenance cadence is small and continuous:
- Weekly: cron job emails the count of stale edges, phantom candidates, and new cycles. Spike means investigate.
- Monthly: 30-minute CAB agenda item to confirm flagged-for-archive assets.
- Quarterly: re-run the full audit and compare deltas. Trend lines should be flat or improving.
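One way to make "spike means investigate" concrete is to compare the latest weekly counts with the average of the preceding weeks. The snapshot shape and the 25% tolerance below are assumptions to tune, not recommendations from any tool.

```javascript
// history: oldest-first array of weekly snapshots,
// e.g. { staleEdges: 40, phantoms: 12, cycles: 3 }.
// Flags any metric whose latest value exceeds the mean of the previous
// weeks by more than `tolerance` (0.25 = 25%, an arbitrary starting point).
function findSpikes(history, tolerance = 0.25) {
  if (history.length < 2) return [];
  const latest = history[history.length - 1];
  const previous = history.slice(0, -1);
  return Object.keys(latest).filter(metric => {
    const mean =
      previous.reduce((sum, week) => sum + (week[metric] || 0), 0) / previous.length;
    return latest[metric] > mean * (1 + tolerance);
  });
}
```

The same snapshots serve the quarterly review: plot them and check that the trend is flat or falling.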
The CMDB never finishes being cleaned. The goal is to keep the rate of rot below the rate of repair.
What changes downstream
A clean graph fixes things you did not realise were broken:
- Change risk assessment reflects actual blast radius.
- Major incident triage finds the real upstream cause faster.
- Asset lifecycle reporting (cost, age, license utilisation) matches reality.
- Onboarding for new SRE hires goes from “ask Bob” to “open the dependency view”.
Each of these is worth the effort on its own. Combined, they justify the weekend.
Bottom line
CMDB cleanup is a four-step protocol: stale edges, phantom nodes, cycles, sparse coverage. Fix the discovery probe first or the cleanup will not stick. Run a small weekly check and a monthly CAB review to keep it clean. The graph that nobody trusts becomes the graph that everyone consults — and that is the asset, not the spreadsheet next to it.