Every workflow change creates a new version. Old in-flight items keep running on their original version. Within months, a busy workflow has 30 versions in the table and no one can confidently say which version a given record is on, much less roll back safely. The discipline is simple but rarely followed.
Number versions like releases
Adopt a versioning convention: major.minor.patch in the workflow description, not just in the auto-incremented version field. Major changes break compatibility. Minor changes add steps without changing behavior of in-flight items. Patches fix bugs. When something goes wrong, the version number tells you whether to roll back or roll forward.
Always document what changed
The workflow description field is also the change log. Every checkout adds a one-line entry: date, author, summary, ticket number. Without this, a year later no one remembers why version 17 differs from version 16. The diff in the platform is technical; the description tells the story.
Limit the number of active versions in flight
A workflow should not have more than three versions concurrently active across in-flight records. If you have five or more, cancel and restart old in-flight items on the current version. Long-tail records on ancient versions are where rollback nightmares originate.
Test rollback, do not just plan for it
A rollback plan that has never been executed is hope. Once a quarter, take a non-critical workflow, publish a benign change, and actually roll back from sub-prod. Time the operation, document the steps, and update the playbook. Rollback that takes three hours of trial-and-error in production is not a rollback; it is an outage.
In-flight migration is its own decision
Some changes can be retroactively applied to in-flight items by republishing. Some cannot. Decide explicitly per change: do new instances start on the new version while old ones finish on the old version, or do all in-flight items move to the new version? The default behavior is the former, which is usually right but not always.
Lock published versions
Once published, the version should be immutable. Edits go into a checkout that becomes a new version. Teams that edit in place break audit and break the in-flight semantics. Use the role boundary on wf_workflow_version to enforce this technically, not just by convention.
Rollback by republish, not by hand-edit
If a workflow needs to revert, find the previous version in wf_workflow_version, check it out, and republish. Do not hand-edit the current version to undo changes; that creates a third state that matches neither the new nor the old. Republishing keeps history clean.
Watch for stuck workflows after rollback
After any rollback, query for workflow contexts in wf_context with state executing and last update older than the rollback window. Some of those will be genuinely stuck; the engine does not always cleanly transition records when their workflow definition shifts under them. Build a recovery script and have it ready.
What to do this week: list your top 10 workflows by execution volume, count active versions per workflow, and clean up anything with more than three.