The number one cause of node restarts I see during code reviews is a while(gr.next()) loop over a million-row table without a thought to memory. setLimit() alone does not save you — it caps rows but still materializes the working set in JVM heap if you sort or join.
The naive loop that kills nodes
var gr = new GlideRecord('incident');
gr.addQuery('state', '!=', 7);
gr.query();
while (gr.next()) {
// expensive per-row logic
}
On a table with 4M active rows, this will silently degrade response time on the entire node. Stats jobs, scheduled jobs, and inbound REST will all queue behind it.
Use chunked sys_id windowing
The pattern that scales is sys_id-keyed windowing. Sort by sys_id, grab N, remember the last sys_id, repeat.
var lastId = '';
var batch = 2000;
while (true) {
var gr = new GlideRecord('incident');
if (lastId) gr.addQuery('sys_id', '>', lastId);
gr.orderBy('sys_id');
gr.setLimit(batch);
gr.query();
if (!gr.hasNext()) break;
while (gr.next()) {
lastId = gr.getUniqueValue();
// work
}
}
This avoids deep OFFSET queries (MySQL’s offset cost is linear) and lets you checkpoint between chunks.
When GlideAggregate beats GlideRecord
For counts, sums, or grouped reads, never use GlideRecord plus a counter. GlideAggregate pushes the work to the database.
var ga = new GlideAggregate('incident');
ga.addAggregate('COUNT', 'assignment_group');
ga.groupBy('assignment_group');
ga.query();
Same answer in 200ms vs 90s on a typical PDI-sized dataset.
Watch the dot-walk trap
Every dot-walk in a loop body (gr.assignment_group.manager.email) triggers a lazy fetch. On a 50K-row chunk, that’s 50K extra queries. Pre-resolve into a Map keyed by sys_id before the loop, or use GlideQuery with explicit select() columns.
What to do this week
Audit sysauto_script for any while(gr.next()) over tables with more than 500K rows. Refactor the top three offenders to sys_id windowing and watch the sys_node_stats heap graph flatten by Friday.