Dual-write is the synchronous bridge between Finance and Operations and Dataverse. The word “synchronous” is misleading — under sustained write load, the queue accumulates and you discover three hours later that customer records updated in F&O are not yet in Dataverse, and sales reps are quoting against stale credit limits. The platform does not page you. You build that yourself.
What dual-write actually is
Each dual-write map is a paired set of mappings between an F&O table and a Dataverse table. Writes on either side hit a sync engine that propagates the change. The engine has a queue per direction. Latency under nominal load is a couple of seconds. Under burst load — month-end invoice posting, a bulk import — the queue grows.
The four lag signals
- Initial sync running: a map is still in its first hydration. New writes do not propagate cleanly until it finishes.
- Queue depth: pending changes per direction per map. Normal is < 100. Concerning is > 5000.
- Error count: row-level failures sitting in the error queue. Each one blocks downstream rows on the same key.
- End-to-end latency: a sample row written on one side, measured arriving on the other.
The Power Platform admin center surfaces the first three at a glance and lies about latency. To know real end-to-end latency, you write your own probe.
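As a rough sketch of how the three platform-visible signals could be folded into one per-map check: the thresholds below come from the list above, while the `MapSignals` shape, the `mapName` field, and the middle "elevated" band are assumptions made for illustration, not anything the platform exposes in this form.

```typescript
// Thresholds from the list above: normal is < 100, concerning is > 5000.
// The "elevated" band between them is an assumption for this sketch.
type QueueHealth = 'normal' | 'elevated' | 'concerning';

// Hypothetical shape: one record per map per direction.
interface MapSignals {
  mapName: string;
  initialSyncRunning: boolean;
  queueDepth: number; // pending changes in one direction
  errorCount: number; // rows sitting in the error queue
}

function classifyQueueDepth(depth: number): QueueHealth {
  if (depth < 100) return 'normal';
  if (depth > 5000) return 'concerning';
  return 'elevated';
}

// A map needs a look if any of the three signals is off.
function needsAttention(s: MapSignals): boolean {
  return (
    s.initialSyncRunning ||
    s.errorCount > 0 ||
    classifyQueueDepth(s.queueDepth) === 'concerning'
  );
}
```

None of this replaces the latency probe. It only tells you which map to open first.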
The probe pattern
Create a paired probe table on both sides — cdm_dwprobe in F&O and cdm_dwprobes in Dataverse — with three columns: probe_id (string), written_at_source (datetime), written_at_sink (datetime). A scheduled Azure Function writes a row to the F&O side every 60 seconds with written_at_source = now. A Dataverse plugin on Create stamps written_at_sink = now when the row arrives. A second scheduled job reads the latest probe and computes lag = written_at_sink - written_at_source.
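// Azure Function: write probe
import { app } from '@azure/functions';
import { randomUUID } from 'node:crypto';

// FO_BASE is read from an app setting (the setting name is an assumption).
// getToken() is assumed to be defined elsewhere, for example with @azure/identity.
const FO_BASE = process.env.FO_BASE;

app.timer('dualwriteProbe', {
  schedule: '0 */1 * * * *', // NCRONTAB: every minute, at second 0
  handler: async () => {
    const probe = {
      probe_id: randomUUID(),
      written_at_source: new Date().toISOString()
    };
    const res = await fetch(`${FO_BASE}/data/CdmDualWriteProbes`, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${await getToken()}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(probe)
    });
    // fetch does not reject on HTTP errors; fail the run so a broken probe is visible.
    if (!res.ok) throw new Error(`Probe write failed: HTTP ${res.status}`);
  }
});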
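// Dataverse plugin: stamp arrival
// Dataverse plug-ins are .NET classes, so this one is C#.
using System;
using Microsoft.Xrm.Sdk;

public class StampProbeArrival : IPlugin
{
    public void Execute(IServiceProvider serviceProvider)
    {
        var ctx = (IPluginExecutionContext)serviceProvider.GetService(typeof(IPluginExecutionContext));
        if (!ctx.InputParameters.Contains("Target") || !(ctx.InputParameters["Target"] is Entity target)) return;
        if (target.LogicalName != "cdm_dwprobe") return;
        // UTC, so the value compares cleanly with written_at_source.
        target["written_at_sink"] = DateTime.UtcNow;
    }
}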
The plugin runs in the PreOperation stage of Create so the stamp lands in the same transaction.
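The second scheduled job is not shown above. A minimal sketch of its lag computation might look like the following, assuming the probe row has already been fetched and its datetimes arrive as ISO 8601 strings. The handling of a probe that has not arrived yet (measuring against the current time) is an assumption, chosen so that a wedged queue shows up as growing lag rather than as a missing data point.

```typescript
interface ProbeRow {
  probe_id: string;
  written_at_source: string;      // ISO 8601, set by the probe writer
  written_at_sink: string | null; // ISO 8601, null until the row arrives
}

// Returns lag in seconds: written_at_sink - written_at_source.
// If the probe has not arrived yet, `now` stands in for the sink time.
// Negative values (clock skew between the two sides) are clamped to 0.
function probeLagSeconds(row: ProbeRow, now: Date): number {
  const source = Date.parse(row.written_at_source);
  const sink =
    row.written_at_sink === null ? now.getTime() : Date.parse(row.written_at_sink);
  return Math.max(0, (sink - source) / 1000);
}
```

Passing `now` in as a parameter, instead of calling `new Date()` inside the function, keeps the computation deterministic and easy to test.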
Alert thresholds that map to user experience
- Warning at 30 seconds: sales reps notice this when they refresh a record after editing F&O.
- Page at 5 minutes: bulk sync is happening or a map is wedged. Either way, on-call eyeballs.
- Page hard at 30 minutes: data is materially drifting. Stop dependent automation if you have a kill switch.
Push the lag metric to Application Insights as a custom metric, then alert from there. Do not use Power Platform monitoring alerts for this — the granularity is too coarse.
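The three thresholds can be expressed as one small function. This is a sketch only: treating each threshold as inclusive is an assumption, the metric name `dualwrite_lag_seconds` is hypothetical, and the metric sink is injected so that it can wrap whatever Application Insights client you use in production and a stub in tests.

```typescript
type Severity = 'ok' | 'warning' | 'page' | 'page_hard';

// Thresholds from the list above: 30 seconds, 5 minutes, 30 minutes.
function classifyLag(lagSeconds: number): Severity {
  if (lagSeconds >= 30 * 60) return 'page_hard';
  if (lagSeconds >= 5 * 60) return 'page';
  if (lagSeconds >= 30) return 'warning';
  return 'ok';
}

// Injected sink: in production this would forward to your telemetry client.
type MetricSink = (name: string, value: number) => void;

// Always emit the raw metric, then return the severity for logging or routing.
function reportLag(lagSeconds: number, sink: MetricSink): Severity {
  sink('dualwrite_lag_seconds', lagSeconds); // hypothetical metric name
  return classifyLag(lagSeconds);
}
```

Emitting the raw number on every run, not only on breaches, lets the alert rules live in the monitoring system and gives you a baseline to compare against.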
When you see lag, what is wrong
Three buckets of causes:
- Sustained write storm: bulk import or background job is producing more writes than the engine can drain. Throttle the source.
- A poisoned row: one row in the error queue is blocking propagation on its key. Find it in the error log, fix or skip, drain proceeds.
- Map stopped: someone disabled or reconfigured a map. Initial sync has to run again, and new writes do not propagate cleanly until it finishes.
The error queue is what you most often hit. The errors are clear once you find them. The problem is finding them — the admin UI lists errors per map but does not group by error message. We export to a SQL table and group there. The same five SKUs cause 90% of errors.
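For illustration, here is the same grouping done in memory instead of in SQL. The `SyncError` shape is an assumption about what an exported error row contains, not the export's actual schema.

```typescript
// Hypothetical shape of one exported error row.
interface SyncError {
  mapName: string;
  recordKey: string;
  message: string;
}

interface ErrorGroup {
  message: string;
  count: number;
  keys: string[]; // record keys that hit this message
}

// Groups errors by trimmed message text and sorts the most frequent first.
function groupByMessage(errors: SyncError[]): ErrorGroup[] {
  const groups = new Map<string, ErrorGroup>();
  errors.forEach((e) => {
    const message = e.message.trim();
    const existing = groups.get(message);
    if (existing) {
      existing.count += 1;
      existing.keys.push(e.recordKey);
    } else {
      groups.set(message, { message, count: 1, keys: [e.recordKey] });
    }
  });
  const out: ErrorGroup[] = [];
  groups.forEach((g) => out.push(g));
  return out.sort((a, b) => b.count - a.count);
}
```

Real error messages often embed record identifiers, so in practice you may need to strip those before grouping. That normalization is left out here.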
The error pattern we see most
A required field on the Dataverse side that is nullable on the F&O side. F&O writes null, Dataverse rejects. The fix is one of:
- Make the Dataverse column nullable.
- Add a default in the dual-write map.
- Filter the row at the map level.
Pick the third option only when the row is genuinely irrelevant. Otherwise you create silent drift.
Network considerations
Dual-write traffic goes through the Power Platform network plane, not your VNet, so outbound proxies and firewalls do not see it. The F&O side does, however, run an outbound webhook for change events, and that path can be throttled by your egress controls if you have customized them. Check EgressFirewallLog in F&O if you see one-sided lag (Dataverse → F&O writes propagate fine, F&O → Dataverse writes stall).
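Spotting one-sided lag requires a lag number for each direction. The sketch below assumes you run a second probe in the reverse direction, which the probe pattern described earlier does not include, and it uses the 5-minute page threshold as its default cutoff.

```typescript
type LagPattern =
  | 'healthy'
  | 'fo_to_dataverse_only'
  | 'dataverse_to_fo_only'
  | 'both';

// Compares per-direction lag against a threshold (default: 5 minutes).
function classifyDirectionalLag(
  foToDataverseSeconds: number,
  dataverseToFoSeconds: number,
  thresholdSeconds: number = 300
): LagPattern {
  const foSideLagging = foToDataverseSeconds >= thresholdSeconds;
  const dvSideLagging = dataverseToFoSeconds >= thresholdSeconds;
  if (foSideLagging && dvSideLagging) return 'both';
  if (foSideLagging) return 'fo_to_dataverse_only';
  if (dvSideLagging) return 'dataverse_to_fo_only';
  return 'healthy';
}
```

A one-sided result is the cue to look at the network path before the sync engine. A result of `both` points back at the causes listed in the previous sections.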
The kill switch
When lag exceeds the page-hard threshold, automation that depends on cross-side consistency must stop. Build a flag in Dataverse (cdm_systemstatus.dualwrite_active = false) and have every dependent Power Automate flow check it as the first step. When the probe-based alert fires, your runbook flips the flag. Downstream automation pauses, and no orchestrator dies on a stale read.
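The same first-step check applies to any code-based automation that sits next to the flows. A minimal sketch of such a guard follows. The flag reader is injected and hypothetical (it stands in for whatever call reads dualwrite_active), and treating a failed flag read as "inactive" is an assumption that favors pausing over acting on possibly stale data.

```typescript
// Resolves to the current value of the dualwrite_active flag.
type FlagReader = () => Promise<boolean>;

type GuardResult<T> = { ran: true; result: T } | { ran: false };

// Runs `action` only when the flag says dual-write is healthy.
// If the flag cannot be read, the guard fails closed and skips the action.
async function runIfDualWriteActive<T>(
  readFlag: FlagReader,
  action: () => Promise<T>
): Promise<GuardResult<T>> {
  let active: boolean;
  try {
    active = await readFlag();
  } catch {
    active = false; // assumption: pause when the flag is unreadable
  }
  if (!active) return { ran: false };
  return { ran: true, result: await action() };
}
```

Returning `{ ran: false }` instead of throwing lets the caller log a skipped run as a normal outcome, so a flipped flag does not look like a failure in your job history.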
Pixel notes
Build a tiny model-driven dashboard that shows current lag, queue depth, and error count per map. Three widgets, refreshes every 60 seconds. The admins love it because the platform’s built-in view requires four clicks to surface the same data. Visibility is a forcing function for ownership.
Read also
For solution boundaries that constrain dual-write maps, see Dependency hell in solutions. Maps are themselves solution components and inherit the same hazards.
Key takeaways
- The admin center does not show end-to-end latency. Build a probe.
- Push lag as a custom metric, alert at 30s / 5m / 30m.
- Most errors are field-level mismatches; export the error log and group.
- Build a kill switch flag for dependent automation.
- One-sided lag often means egress firewall, not the sync engine.