
A user opens Virtual Agent on Tuesday, gets partway through a request, is pulled into a meeting, and comes back Wednesday morning. Before 2026, that meant restarting from scratch and re-explaining the situation. The 2026 multi-turn context release changes that, and it also invalidates topic design assumptions that were tuned for single-turn sessions. The release is a net gain, but the migration cost is real.

What Changed

The 2026 Virtual Agent supports multi-turn conversations with context retention across sessions. A user can return two hours or two days later and the agent remembers what they asked, what was attempted, what slot values were already provided, and where the conversation paused. The session state lives in sys_cs_conversation with a configurable retention window per topic.

Session resume policy:
  default: resume within 24h, prompt to continue
  extended: resume within 7d, prompt to continue
  short: resume within 1h, otherwise fresh start
  ephemeral: do not retain context across sessions
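
The policy list above can be read as a lookup table plus an elapsed-time check. The sketch below is a minimal illustration of that reading, not the platform's implementation: the names RESUME_POLICIES and decideResume are hypothetical, and it assumes that the short policy resumes without a prompt, which the list does not state.

```javascript
// Hypothetical policy table mirroring the list above; windows are in hours.
var RESUME_POLICIES = {
  'default': { windowHours: 24, promptToContinue: true },
  'extended': { windowHours: 168, promptToContinue: true },
  'short': { windowHours: 1, promptToContinue: false }, // assumption: no prompt
  'ephemeral': { windowHours: 0, promptToContinue: false }
};

// Returns 'resume_with_prompt', 'resume', or 'fresh_start'.
// Unknown policy names fall back to the default policy.
function decideResume(policyName, lastActivityMs, nowMs) {
  var known = Object.prototype.hasOwnProperty.call(RESUME_POLICIES, policyName);
  var policy = RESUME_POLICIES[known ? policyName : 'default'];
  var elapsedHours = (nowMs - lastActivityMs) / (60 * 60 * 1000);
  if (policy.windowHours === 0 || elapsedHours > policy.windowHours) {
    return 'fresh_start';
  }
  return policy.promptToContinue ? 'resume_with_prompt' : 'resume';
}
```

Keeping the decision in one pure function makes it easy to test each window boundary before attaching it to real conversation records.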

Why It Matters

The previous Virtual Agent forced repetition — each session started cold and the user re-supplied every piece of information they had given before. Users learned to avoid it and went straight to human agents, which defeated the deflection goal. Persistent context makes Virtual Agent usable for real resolution flows that span more than a single sitting, not just FAQ deflection. Containment rate ceilings that capped around 25% in single-turn deployments tend to move into the 40-50% range when topics are designed for resume.

What To Retrain

If your Virtual Agent was tuned around single-turn flows, revisit topic design. Multi-step topics can now span sessions; invite the user to continue where they left off instead of restarting. Slot-filling that previously had to fit inside one conversation can spread across multiple sessions. Topics that used to short-circuit to “transfer to agent” because the conversation got too long no longer need to — but the topic must be re-designed to take advantage.

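// Resume a topic at the saved slot position.
// Note: startTopic() is not defined in this snippet; treat it as a placeholder
// for whatever launches a topic in your instance.
function resumeTopic(conversation) {
  var context = conversation.context || {};
  var savedSlots = context.slots || {};   // slot values collected in earlier sessions
  var resumePoint = context.last_node;    // node where the conversation paused
  return startTopic({
    topic_id: conversation.topic_id,
    slots: savedSlots,
    start_node: resumePoint,
    greeting: 'Welcome back. Continuing where we left off...'
  });
}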

Measure Before and After

Track containment rate (resolved without human), deflection rate (handoff prevented), average session length, multi-session resume rate, and return-user rate. Multi-turn context should lift all of them. If the metrics do not move after the upgrade, the bottleneck was topic quality, not session boundaries — multi-turn context cannot rescue topics that were poorly designed for any session model.
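
As a rough illustration of the before-and-after comparison, the sketch below computes two of the metrics from flattened conversation summaries. The record shape (resolved, handedOff, sessionCount) and both function names are assumptions made for this example; real exports will look different.

```javascript
// Each record is a hypothetical conversation summary:
// { resolved: boolean, handedOff: boolean, sessionCount: number }
function summarizeMetrics(records) {
  var total = records.length;
  var contained = 0;
  var resumed = 0;
  records.forEach(function (r) {
    // Contained: resolved without ever reaching a human agent.
    if (r.resolved && !r.handedOff) { contained++; }
    // Resumed: the conversation spanned more than one session.
    if (r.sessionCount > 1) { resumed++; }
  });
  return {
    total: total,
    containmentRate: total ? contained / total : 0,
    resumeRate: total ? resumed / total : 0
  };
}

// Difference between two summaries, for example pre- and post-upgrade.
function compareMetrics(before, after) {
  return {
    containmentDelta: after.containmentRate - before.containmentRate,
    resumeDelta: after.resumeRate - before.resumeRate
  };
}
```

Computing both periods with the same function keeps the definitions identical on each side of the upgrade, which matters more than the exact formula chosen.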

Privacy and Retention

Persistent context means the agent retains user-supplied information across sessions. Set explicit retention policies per topic — short retention for sensitive data, longer retention for routine workflows. Honor user-initiated forget requests by clearing both the conversation record and the derived context. The audit trail should record context retention duration and any forget operations, not just the conversation transcripts.
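
The forget requirement can be sketched with an in-memory store that keeps conversation records and derived context separately, clears both on request, and writes an audit entry. Everything here (createContextStore, the field names, the audit entry shape) is hypothetical and stands in for whatever tables and logging your instance actually uses.

```javascript
// In-memory stand-in for conversation storage; all names are illustrative.
function createContextStore() {
  var conversations = {};  // conversationId -> { userId, transcript }
  var derivedContext = {}; // conversationId -> { slots, lastNode }
  var auditLog = [];       // retention and forget events

  return {
    save: function (id, userId, transcript, context, retentionHours) {
      conversations[id] = { userId: userId, transcript: transcript };
      derivedContext[id] = context;
      auditLog.push({ action: 'retain', conversationId: id, retentionHours: retentionHours });
    },
    // Clears both the conversation record and the derived context for a user.
    forgetUser: function (userId) {
      var removed = 0;
      Object.keys(conversations).forEach(function (id) {
        if (conversations[id].userId === userId) {
          delete conversations[id];
          delete derivedContext[id];
          removed++;
        }
      });
      auditLog.push({ action: 'forget', userId: userId, removed: removed });
      return removed;
    },
    hasContext: function (id) {
      return Object.prototype.hasOwnProperty.call(derivedContext, id);
    },
    audit: function () { return auditLog.slice(); }
  };
}
```

The point of the sketch is the pairing: a forget operation that deletes the transcript but leaves saved slot values behind has not actually forgotten anything.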

Common Failure Modes

Single-turn topics that pile up partial state across sessions and confuse the user. Refactor the topic for explicit resume semantics; do not just enable persistence on the old design.

Resume prompts that ask “would you like to continue?” with no context. Show what was asked and what was answered so the user can decide intelligently.

Context retention windows set globally rather than per topic. Set them per topic instead: short retention for password-related conversations, longer retention for project-tracking conversations.
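
To make the resume-prompt point concrete, here is a minimal sketch that builds a prompt from the saved slot values, listing what was already answered and what is still missing. The function name and the slot definition shape (name, label) are assumptions for illustration; a real topic would also need to avoid echoing sensitive values.

```javascript
// slotDefs: [{ name: 'model', label: 'Model' }, ...] (hypothetical shape)
// savedSlots: { model: 'ThinkPad X1', ... }
function buildResumePrompt(topicName, slotDefs, savedSlots) {
  var answered = [];
  var pending = [];
  slotDefs.forEach(function (def) {
    var value = savedSlots[def.name];
    if (value !== undefined && value !== null && value !== '') {
      answered.push(def.label + ': ' + value);
    } else {
      pending.push(def.label);
    }
  });
  var lines = ['Welcome back. You were working on "' + topicName + '".'];
  if (answered.length) {
    lines.push('So far you told me: ' + answered.join('; ') + '.');
  }
  if (pending.length) {
    lines.push('Still needed: ' + pending.join(', ') + '.');
  }
  lines.push('Would you like to continue?');
  return lines.join('\n');
}
```

The closing question is the same one the failure mode describes; the difference is that the user now sees enough state to answer it.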

Also New in 2026: Intent Reranking

Beyond multi-turn, the 2026 release added optional Now Assist-powered intent reranking — the agent can refine its understanding mid-conversation as the user provides more context, rather than committing to the initial intent classification. This is a significant lift for ambiguous opening questions but adds token cost; instrument and budget per topic.
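
One way to instrument and budget per topic is a small counter that records token usage and refuses reranking once a topic exceeds its limit. This is a sketch under assumptions: createTokenBudget and its methods are hypothetical, the limits are made-up numbers, and how token counts are actually reported depends on your instance.

```javascript
// limitsByTopic: { topicId: maxTokensPerPeriod } (hypothetical configuration)
function createTokenBudget(limitsByTopic) {
  var used = {}; // topicId -> tokens consumed so far

  return {
    // Record tokens consumed by one reranking call.
    record: function (topicId, tokens) {
      used[topicId] = (used[topicId] || 0) + tokens;
    },
    // Topics without a configured limit are not allowed to rerank.
    canRerank: function (topicId) {
      if (!Object.prototype.hasOwnProperty.call(limitsByTopic, topicId)) {
        return false;
      }
      return (used[topicId] || 0) < limitsByTopic[topicId];
    },
    usage: function (topicId) {
      return used[topicId] || 0;
    }
  };
}
```

Defaulting unconfigured topics to no reranking means a topic only incurs token cost after someone has decided what it is allowed to spend.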

Implementation Sequence

Pick the highest-traffic topic with the highest abandonment rate at the second or third turn. Redesign it for multi-turn resume, validate the new design against historical conversation logs (does it complete the historical examples?), enable it for one user group for two weeks, and measure. Then roll out to additional topics in priority order. Enabling multi-turn across every topic on day one produces inconsistent UX and a debugging nightmare when one topic’s state pollutes another.

What to do this week: pull the abandonment rate per Virtual Agent topic at each turn count; topics with high turn-3 abandonment are your multi-turn redesign candidates.
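
A minimal sketch of that per-topic, per-turn abandonment pull, assuming conversation logs have already been exported into simple records. The record shape (topic, turns, completed) and the function name are assumptions; a conversation counts as abandoned at the last turn it reached if it never completed.

```javascript
// Each record is hypothetical: { topic: string, turns: number, completed: boolean }
// Returns topic -> turn -> { reached, abandoned, rate }.
function abandonmentByTurn(conversations) {
  var stats = {};
  conversations.forEach(function (c) {
    if (c.turns < 1) { return; } // skip conversations with no user turns
    var perTurn = stats[c.topic] || (stats[c.topic] = {});
    // Every turn up to the last one counts as reached.
    for (var t = 1; t <= c.turns; t++) {
      var cell = perTurn[t] || (perTurn[t] = { reached: 0, abandoned: 0, rate: 0 });
      cell.reached++;
    }
    // An incomplete conversation was abandoned at its final turn.
    if (!c.completed) { perTurn[c.turns].abandoned++; }
  });
  Object.keys(stats).forEach(function (topic) {
    Object.keys(stats[topic]).forEach(function (turn) {
      var cell = stats[topic][turn];
      cell.rate = cell.abandoned / cell.reached;
    });
  });
  return stats;
}
```

Sorting topics by their turn-3 rate, weighted by how many conversations reached turn 3, gives a first-pass list of redesign candidates.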
