ServiceNow’s Automated Test Framework (ATF) is the right tool for regression testing customizations. The trap is treating it like JUnit and trying to cover everything. ATF tests are slow to write and slow to run, so coverage strategy matters more than coverage percentage.
Cover the upgrade path, not the unit logic
ATF’s highest value is catching regressions during upgrades and update set promotions. Focus tests on end-to-end user journeys: submit a catalog item, complete an approval, resolve an incident with a specific assignment rule. These exercise the integration points that upgrades break. Unit-level tests of business rules duplicate what server-side script tests already cover.
Tier tests by criticality
Tag tests as Critical, Important, or Coverage. Critical tests run on every update set promotion and every patch. Important tests run nightly. Coverage tests run weekly. Without tiers, every test runs every time and developers eventually disable the suite because it slows down deployment.
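The tiering rule can be sketched as a small lookup from trigger to tiers. This is a minimal Python illustration of the selection logic, not ATF configuration: the trigger names, the `AtfTest` class, and the test names are hypothetical, and it assumes tiers are cumulative (a nightly run also includes Critical tests), which the schedule above does not state explicitly.

```python
from dataclasses import dataclass

# Which tiers run for each trigger. Trigger names are illustrative labels,
# and the cumulative nightly/weekly sets are an assumption.
TIERS_BY_TRIGGER = {
    "promotion": {"Critical"},
    "patch": {"Critical"},
    "nightly": {"Critical", "Important"},
    "weekly": {"Critical", "Important", "Coverage"},
}


@dataclass(frozen=True)
class AtfTest:
    name: str
    tier: str  # "Critical", "Important", or "Coverage"


def select_tests(tests, trigger):
    """Return the tests whose tier is scheduled for the given trigger."""
    tiers = TIERS_BY_TRIGGER[trigger]
    return [t for t in tests if t.tier in tiers]


tests = [
    AtfTest("Submit laptop catalog item", "Critical"),
    AtfTest("Approve change request", "Important"),
    AtfTest("Knowledge article feedback form", "Coverage"),
]

print([t.name for t in select_tests(tests, "promotion")])
print(len(select_tests(tests, "weekly")))
```

In practice the same mapping would usually live in separate test suites, one per tier, each with its own schedule.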
Write data setup once, reuse forever
Data setup is the most fragile part of ATF tests. Build a small library of “Test Data Builders” as Script Includes that create users, groups, CIs, and incidents in a known state. Tests call the builders rather than creating data inline. When the data model changes, you fix the builder once instead of editing 80 tests.
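The builder pattern itself is language-agnostic, so here is a minimal Python sketch of its shape. On a real instance the builders would be Script Includes written in JavaScript that insert records; the `InMemoryStore` and `DataBuilders` classes and all default values below are hypothetical stand-ins used only to show the structure.

```python
import itertools


class InMemoryStore:
    """Stand-in for the instance database (an assumption for this sketch)."""

    def __init__(self):
        self.tables = {}
        self._ids = itertools.count(1)

    def insert(self, table, fields):
        record_id = f"{table}-{next(self._ids)}"
        self.tables.setdefault(table, {})[record_id] = dict(fields)
        return record_id


class DataBuilders:
    """Creates records in a known state; tests override only what they need."""

    def __init__(self, store):
        self.store = store

    def group(self, **overrides):
        fields = {"name": "ATF Test Group", "active": True}
        fields.update(overrides)
        return self.store.insert("sys_user_group", fields)

    def incident(self, assignment_group=None, **overrides):
        fields = {
            "short_description": "ATF test incident",
            "priority": 3,
            # Create a default group unless the test supplies one.
            "assignment_group": assignment_group or self.group(),
        }
        fields.update(overrides)
        return self.store.insert("incident", fields)


store = InMemoryStore()
builders = DataBuilders(store)

# The test states only the detail it cares about; defaults cover the rest.
incident_id = builders.incident(priority=1)
print(store.tables["incident"][incident_id])
```

If a new mandatory field appears on the incident table, only the defaults inside `incident()` change; the tests that call it stay as they are.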
Clean up after yourself, always
Every test should delete the data it created. Tests that leak data inflate the database, slow down subsequent runs, and produce misleading results when leftover records interfere with other tests. ATF rolls back the data changes it tracks during a test, but confirm that nothing is left behind rather than assuming it. Use the built-in Impersonate and Set Up Roles steps with cleanup steps in the teardown phase.
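One way to make cleanup unconditional is to register every record at creation time and delete the whole list in a teardown that runs even when the test fails. The Python sketch below shows that idea with a plain dictionary standing in for the database; `tracked_records` and the record IDs are hypothetical names for illustration only.

```python
from contextlib import contextmanager


@contextmanager
def tracked_records(tables):
    """Yield a create() function; delete everything it created on exit."""
    created = []

    def create(table, record_id, fields):
        tables.setdefault(table, {})[record_id] = dict(fields)
        created.append((table, record_id))
        return record_id

    try:
        yield create
    finally:
        # Delete in reverse order so dependent records go before their parents.
        for table, record_id in reversed(created):
            tables[table].pop(record_id, None)


tables = {"incident": {}, "sys_user_group": {}}

try:
    with tracked_records(tables) as create:
        create("sys_user_group", "g1", {"name": "ATF Test Group"})
        create("incident", "i1", {"assignment_group": "g1"})
        raise AssertionError("simulated test failure")
except AssertionError:
    pass

# Both tables are empty again even though the test body failed.
print(tables)
```

The point of the `finally` block is that a failing assertion cannot skip cleanup, which is exactly the case where leaked data tends to accumulate.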
Use the Page Inspector for UI tests
UI tests recorded with the click-recorder are brittle. Use the Page Inspector to find stable selectors (sys_id-based or label-based) rather than relying on auto-generated XPath. A test that asserts “the Save button exists” is more durable than one that asserts “the third button in div.form-actions exists.”
Run in a clone, never in production
ATF can technically run in production. It should not. UI tests impersonate users, fire emails, and trigger workflows. Run the suite in a sub-production clone that is refreshed weekly. The clone is also a useful proving ground for changes before they hit production.
Track flaky tests separately
Some tests fail intermittently because of timing, asynchronous events, or an unstable environment. Tag flaky tests, route their failures to a separate queue, and either fix or delete them. Failing tests that are routinely ignored train the team to ignore all failures, which destroys the value of the suite.
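As a rough sketch of the tagging and routing step, the Python below classifies a test from its recent run history and sends failures of flaky tests to a separate queue. The function names, queue names, and sample history are hypothetical, and the classification assumes all runs in the history were against unchanged code, so mixed results point to flakiness rather than a real regression.

```python
def classify(history):
    """history: list of booleans (True = pass) for recent runs on unchanged code."""
    if all(history):
        return "stable"
    if not any(history):
        return "failing"
    return "flaky"  # a mix of passes and failures


def route_failure(test_name, flaky_tests):
    """Pick the queue a failure notification should go to."""
    return "flaky-queue" if test_name in flaky_tests else "main-queue"


history = {
    "Resolve incident": [True, True, True, True],
    "Approve change": [True, False, True, False],
    "Submit catalog item": [False, False, False],
}

flaky_tests = {name for name, runs in history.items() if classify(runs) == "flaky"}

print(flaky_tests)
print(route_failure("Approve change", flaky_tests))
print(route_failure("Submit catalog item", flaky_tests))
```

A consistently failing test stays in the main queue on purpose: it is a real failure and should not be hidden alongside the intermittent ones.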
Measure value, not lines
Report on test value: how many regressions the suite caught last quarter, how much time it saved versus manual UAT, and how many failures were flaky. Coverage percentage is a vanity metric. A suite of 50 well-targeted tests catching real regressions beats 500 brittle tests that nobody trusts.
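The report can be reduced to a few lines of arithmetic. The Python sketch below is one possible way to compute it; the `QuarterStats` fields and every number in the example are hypothetical, and subtracting suite maintenance time from the hours saved is an assumption about how to keep the figure honest, not part of the original list of metrics.

```python
from dataclasses import dataclass


@dataclass
class QuarterStats:
    regressions_caught: int
    manual_uat_hours_per_cycle: float
    release_cycles: int
    suite_maintenance_hours: float
    total_failures: int
    flaky_failures: int


def value_report(s):
    """Summarize suite value for one quarter."""
    # Net hours saved: manual UAT avoided minus time spent keeping the suite alive.
    hours_saved = (
        s.manual_uat_hours_per_cycle * s.release_cycles - s.suite_maintenance_hours
    )
    flaky_share = s.flaky_failures / s.total_failures if s.total_failures else 0.0
    return {
        "regressions_caught": s.regressions_caught,
        "net_hours_saved": hours_saved,
        "flaky_share_of_failures": flaky_share,
    }


# Illustrative numbers only.
stats = QuarterStats(
    regressions_caught=7,
    manual_uat_hours_per_cycle=16.0,
    release_cycles=6,
    suite_maintenance_hours=30.0,
    total_failures=40,
    flaky_failures=10,
)

print(value_report(stats))
```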
What to do this week: tier your existing tests, identify the top three critical user journeys without coverage, and write one ATF test for the most important of them.