
Dual-write is the synchronous bridge between Finance and Operations and Dataverse. The word “synchronous” is misleading — under sustained write load, the queue accumulates and you discover three hours later that customer records updated in F&O are not yet in Dataverse, and sales reps are quoting against stale credit limits. The platform does not page you. You build that yourself.

What dual-write actually is

Each dual-write map is a paired set of mappings between an F&O table and a Dataverse table. Writes on either side hit a sync engine that propagates the change. The engine has a queue per direction. Latency under nominal load is a couple of seconds. Under burst load — month-end invoice posting, a bulk import — the queue grows.

The four lag signals

  1. Initial sync running: a map is still in its first hydration. New writes do not propagate cleanly until it finishes.
  2. Queue depth: pending changes per direction per map. Normal is < 100. Concerning is > 5000.
  3. Error count: row-level failures sitting in the error queue. Each one blocks downstream rows on the same key.
  4. End-to-end latency: a sample row written on one side, measured arriving on the other.

The Power Platform admin center surfaces the first three at a glance and lies about latency. To know real end-to-end latency, you write your own probe.
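The three signals the admin center does expose can be folded into one status per map. A minimal sketch, assuming a hypothetical MapSignals shape that you fill from wherever you collect those numbers. The 'elevated' band between 100 and 5000 is an assumption; the list above only names the two ends.

```typescript
// Hypothetical shape: one snapshot per map and direction.
interface MapSignals {
  mapName: string;
  initialSyncRunning: boolean;
  queueDepth: number; // pending changes in this direction
  errorCount: number; // rows sitting in the error queue
}

type MapStatus = 'ok' | 'elevated' | 'concerning';

function assessMap(s: MapSignals): MapStatus {
  // Initial sync or a very deep queue: writes are not propagating cleanly.
  if (s.initialSyncRunning || s.queueDepth > 5000) return 'concerning';
  // Any error row can block later rows on the same key.
  // Treating 100..5000 as 'elevated' is an assumption, not a platform rule.
  if (s.errorCount > 0 || s.queueDepth >= 100) return 'elevated';
  return 'ok';
}
```

This says nothing about latency, which is why the probe is still needed.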

The probe pattern

Create a paired probe table on both sides — cdm_dwprobe in F&O and cdm_dwprobes in Dataverse — with three columns: probe_id (string), written_at_source (datetime), written_at_sink (datetime). A scheduled Azure Function writes a row to the F&O side every 60 seconds with written_at_source = now. A Dataverse plugin on Create stamps written_at_sink = now when the row arrives. A second scheduled job reads the latest probe and computes lag = written_at_sink - written_at_source.

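// Azure Function: write probe
// FO_BASE (F&O base URL) and getToken() (OAuth helper) are assumed to be defined elsewhere.
import { app } from '@azure/functions';
import { randomUUID } from 'node:crypto';

app.timer('dualwriteProbe', {
  schedule: '0 */1 * * * *', // six-field NCRONTAB: once a minute
  handler: async () => {
    const probe = {
      probe_id: randomUUID(),
      written_at_source: new Date().toISOString() // UTC
    };
    const res = await fetch(`${FO_BASE}/data/CdmDualWriteProbes`, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${await getToken()}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(probe)
    });
    // fetch does not reject on HTTP errors; fail loudly so a missed probe write is not read as lag.
    if (!res.ok) {
      throw new Error(`Probe write failed with HTTP ${res.status}`);
    }
  }
});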
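// Dataverse plugin: stamp arrival (C#, since Dataverse plugins are .NET assemblies)
using System;
using Microsoft.Xrm.Sdk;

public class StampProbeArrival : IPlugin
{
    public void Execute(IServiceProvider serviceProvider)
    {
        var ctx = (IPluginExecutionContext)serviceProvider.GetService(typeof(IPluginExecutionContext));
        // Target is the row being created; ignore anything that is not a probe.
        if (!ctx.InputParameters.Contains("Target") || !(ctx.InputParameters["Target"] is Entity target)) return;
        if (target.LogicalName != "cdm_dwprobe") return;
        // UTC, so the stamp compares cleanly with written_at_source.
        target["written_at_sink"] = DateTime.UtcNow;
    }
}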

The plugin runs in the PreOperation stage of Create so the stamp lands in the same transaction.
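The second scheduled job can be small. A minimal sketch of the lag calculation, assuming the job has already fetched the newest probe row from Dataverse; the ArrivedProbe and LagReading shapes and the readLag name are illustrative. It reports two numbers because a wedged map never delivers a new probe: the last row that did arrive still shows a healthy transit time, so the time since the last arrival is the signal that keeps growing.

```typescript
// Hypothetical row shape, mirroring the three probe columns.
interface ArrivedProbe {
  probe_id: string;
  written_at_source: string; // ISO 8601, set by the Azure Function
  written_at_sink: string;   // ISO 8601, set by the Dataverse plugin
}

interface LagReading {
  transitSeconds: number;          // written_at_sink - written_at_source
  sinceLastArrivalSeconds: number; // now - written_at_sink of the newest probe
}

function readLag(latest: ArrivedProbe, now: Date): LagReading {
  const source = Date.parse(latest.written_at_source);
  const sink = Date.parse(latest.written_at_sink);
  if (Number.isNaN(source) || Number.isNaN(sink)) {
    throw new Error(`Unparseable timestamp on probe ${latest.probe_id}`);
  }
  return {
    // Clamp at zero: clock skew between the two sides can make sink < source.
    transitSeconds: Math.max(0, (sink - source) / 1000),
    sinceLastArrivalSeconds: Math.max(0, (now.getTime() - sink) / 1000),
  };
}
```

With a 60-second probe interval, sinceLastArrivalSeconds hovering below roughly a minute is expected; alert on whichever of the two values is larger.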

Alert thresholds that map to user experience

  • Warning at 30 seconds: sales reps notice this when they refresh a record after editing F&O.
  • Page at 5 minutes: bulk sync is happening or a map is wedged. Either way, on-call eyeballs.
  • Page hard at 30 minutes: data is materially drifting. Stop dependent automation if you have a kill switch.
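The three thresholds above translate directly into a classifier. A minimal sketch; the Severity names and the choice to treat each boundary as inclusive are assumptions.

```typescript
type Severity = 'ok' | 'warning' | 'page' | 'page-hard';

// Thresholds from the list above, in seconds.
const THRESHOLDS_SECONDS = {
  warning: 30,
  page: 5 * 60,
  pageHard: 30 * 60,
} as const;

function classifyLag(lagSeconds: number): Severity {
  // Check the most severe band first so each lag value maps to one level.
  if (lagSeconds >= THRESHOLDS_SECONDS.pageHard) return 'page-hard';
  if (lagSeconds >= THRESHOLDS_SECONDS.page) return 'page';
  if (lagSeconds >= THRESHOLDS_SECONDS.warning) return 'warning';
  return 'ok';
}
```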

Push the lag metric to Application Insights as a custom metric, then alert from there. Do not use Power Platform monitoring alerts for this — the granularity is too coarse.

When you see lag, what is wrong

Three buckets of causes:

  • Sustained write storm: bulk import or background job is producing more writes than the engine can drain. Throttle the source.
  • A poisoned row: one row in the error queue is blocking propagation on its key. Find it in the error log, fix or skip, drain proceeds.
  • Map stopped: someone disabled or reconfigured a map. Initial sync runs again, and writes lag until it finishes.

The error queue is what you most often hit. The errors are clear once you find them. The problem is finding them — the admin UI lists errors per map but does not group by error message. We export to a SQL table and group there. The same five SKUs cause 90% of errors.
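We do the grouping in SQL; the same idea in TypeScript, for anyone who would rather group in the export job itself, looks roughly like this. A sketch only: the ErrorRow shape is a hypothetical export format, not what the admin UI emits.

```typescript
// Hypothetical export shape: one row per failed record.
interface ErrorRow {
  mapName: string;
  message: string;
}

interface ErrorGroup {
  message: string;
  count: number;
  maps: string[]; // maps on which this message appeared
}

function groupByMessage(rows: ErrorRow[]): ErrorGroup[] {
  const groups = new Map<string, { count: number; maps: Set<string> }>();
  for (const row of rows) {
    const key = row.message.trim();
    const group = groups.get(key) ?? { count: 0, maps: new Set<string>() };
    group.count += 1;
    group.maps.add(row.mapName);
    groups.set(key, group);
  }
  // Most frequent message first; ties broken alphabetically for stable output.
  return Array.from(groups.entries())
    .map(([message, g]) => ({ message, count: g.count, maps: Array.from(g.maps).sort() }))
    .sort((a, b) => b.count - a.count || a.message.localeCompare(b.message));
}
```

Sorting by count puts the handful of messages that account for most failures at the top.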

The error pattern we see most

The most common case is a field that is required on the Dataverse side but nullable on the F&O side. F&O writes null, and Dataverse rejects the row. The fix is one of:

  1. Make the Dataverse column optional (not business required).
  2. Add a default in the dual-write map.
  3. Filter the row at the map level.

Pick the third option only when the row is genuinely irrelevant. Otherwise you create silent drift.

Network considerations

Dual-write traffic goes through the Power Platform network plane, not your VNet. Outbound proxies and firewalls do not see it. But the F&O side does run an outbound webhook for change events, and that path can be throttled by your egress controls if you have customized them. Check EgressFirewallLog in F&O if you see one-sided lag (writes propagate Dataverse → F&O fine, F&O → Dataverse stalls).

The kill switch

When lag exceeds the page-hard threshold, automation that depends on cross-side consistency must stop. Build a flag in Dataverse (cdm_systemstatus.dualwrite_active, set to false while dual-write is unhealthy) and have every dependent Power Automate flow check it as the first step. When the probe-based alert fires, your runbook flips the flag. Downstream automation pauses, and no orchestrator dies on a stale read.
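Power Automate flows do this check with a condition step, but custom code that depends on cross-side consistency needs the same guard. A minimal sketch, assuming a hypothetical readFlag function that returns the current value of cdm_systemstatus.dualwrite_active however you choose to fetch it. Failing closed when the flag cannot be read is a design assumption, not something the platform enforces.

```typescript
// Hypothetical reader: resolves to the current value of
// cdm_systemstatus.dualwrite_active.
type FlagReader = () => Promise<boolean>;

type GuardedResult<T> = { ran: true; result: T } | { ran: false };

async function runIfDualWriteActive<T>(
  readFlag: FlagReader,
  action: () => Promise<T>,
): Promise<GuardedResult<T>> {
  let active: boolean;
  try {
    active = await readFlag();
  } catch {
    // Fail closed: an unreadable flag is treated as "dual-write unhealthy".
    active = false;
  }
  if (!active) return { ran: false };
  return { ran: true, result: await action() };
}
```

Callers get an explicit ran: false instead of an exception, so a paused job can log and exit cleanly and simply run again once the runbook flips the flag back.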

Pixel notes

Build a tiny model-driven dashboard that shows current lag, queue depth, and error count per map. Three widgets, refreshes every 60 seconds. The admins love it because the platform’s built-in view requires four clicks to surface the same data. Visibility is a forcing function for ownership.

Read also

For solution boundaries that constrain dual-write maps, see Dependency hell in solutions. Maps are themselves solution components and inherit the same hazards.

Key takeaways

  • The admin center does not show end-to-end latency. Build a probe.
  • Push lag as a custom metric, alert at 30s / 5m / 30m.
  • Most errors are field-level mismatches; export the error log and group.
  • Build a kill switch flag for dependent automation.
  • One-sided lag often means egress firewall, not the sync engine.