Building stateless workflows with the Microsoft Agent Framework requires understanding how workflows pause, persist, and resume across independent executions.
This post focuses on Request Ports, Checkpoints, and the replay behaviour that underpins stateless workflow execution. By combining these pieces, you can build durable, human-in-the-loop workflows that continue reliably even when each step happens in a new process or request.
You can get the code here
Request Ports
Request Ports provide a controlled way for a workflow to pause its execution and wait for information from an external source—typically a human user or another system. When the workflow reaches a Request Port, it emits a Request Message, which is sent out of the workflow to whatever component handles user interaction (like a chat UI, API, or custom handler).
Once the request is sent, the workflow stops executing. No further workflow steps run until a corresponding Response Message is received. This response resumes the workflow exactly where it left off, allowing it to continue with the newly provided information.
At a high level, Request Ports allow your workflow to:
- Pause execution at specific points.
- Ask for missing information or human input.
- Resume seamlessly once the required data is returned.
They effectively create a “human-in-the-loop” moment while keeping the workflow state stable and predictable.
You create the Request Port and connect it to your workflow with edges: one from the calling node to the port, and one from the port back to that node. In this case, the calling node is the ActNode.
    private async Task<Workflow<ChatMessage>> BuildWorkflow()
    {
        var requestPort = RequestPort.Create<UserRequest, UserResponse>("user-input");
        var reasonNode = new ReasonNode(reasonAgent);
        var actNode = new ActNode(actAgent);
        var builder = new WorkflowBuilder(reasonNode);
        builder.AddEdge(reasonNode, actNode);
        builder.AddEdge(actNode, requestPort);
        builder.AddEdge(requestPort, actNode);
        builder.AddEdge(actNode, reasonNode);
        return await builder.BuildAsync<ChatMessage>();
    }

The ActNode sends the User Request message, which is routed to the Request Port.
    await context.SendMessageAsync(new UserRequest(cleanedResponse), cancellationToken: cancellationToken);

The ActNode also has a handler that receives the User Response.
    public async ValueTask HandleAsync(UserResponse userResponse, IWorkflowContext context,
        CancellationToken cancellationToken = default)
    {
        await context.SendMessageAsync(new ActObservation(userResponse.Message), cancellationToken: cancellationToken);
    }

Checkpoints
Checkpoints capture the state of the workflow at a specific moment so that execution can be paused and later continued without losing context. In this post we wired this up with a custom CheckpointStore that persists each checkpoint using the combination of runId and checkpointId.
When the workflow hits a boundary such as a Request Port, the framework calls into the ICheckpointStore and asks it to persist the checkpoint. That means the state at that super‑step is stored, but nothing is resumed automatically.
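As a rough illustration of keying persisted state by runId and checkpointId, here is a minimal file-backed sketch. It is not the framework's ICheckpointStore interface: `FileCheckpointStore`, `DemoState`, the `Save` and `Load` methods, and the directory layout are all hypothetical names chosen for this example, and it assumes the runId is safe to use as a directory name.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical state payload, used only for this illustration.
public sealed record DemoState(int Step, string PendingQuestion);

// Hypothetical sketch; NOT the framework's ICheckpointStore.
// Each checkpoint is written to <root>/<runId>/<checkpointId>.json.
public sealed class FileCheckpointStore
{
    private readonly string _root;

    public FileCheckpointStore(string root) => _root = root;

    // Serializes the state to disk and returns the generated checkpoint id.
    public string Save<TState>(string runId, TState state)
    {
        var checkpointId = Guid.NewGuid().ToString("N");
        var path = PathFor(runId, checkpointId);
        Directory.CreateDirectory(Path.GetDirectoryName(path)!);
        File.WriteAllText(path, JsonSerializer.Serialize(state));
        return checkpointId;
    }

    // Reads the state back; throws an IOException subtype if the file does not exist.
    public TState Load<TState>(string runId, string checkpointId)
    {
        var json = File.ReadAllText(PathFor(runId, checkpointId));
        return JsonSerializer.Deserialize<TState>(json)!;
    }

    // Assumes runId is a valid directory name.
    private string PathFor(string runId, string checkpointId) =>
        Path.Combine(_root, runId, checkpointId + ".json");
}
```

Because the key is the pair (runId, checkpointId), a second store instance pointed at the same root directory, for example in a new process, can load what the first one saved. Nothing is resumed by saving alone, which matches the behaviour described above.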
If you want to resume later (for example, when a response to a Request Port arrives), you have to:
- Track, in your own application, which super‑step (or conversation turn) maps to which checkpointId.
- Look up the correct checkpointId for the run you want to continue.
- Ask the workflow (via the in‑process execution APIs) to resume the stream from that checkpoint.
So checkpoints give you durability and a stable resume point, but your application is responsible for managing the mapping and explicitly triggering the resume, especially when wiring request/response flows back into a long‑running conversation.
We manually track the latest checkpoint from each completed super-step.
    if (evt is SuperStepCompletedEvent superStepCompletedEvt)
    {
        var checkpoint = superStepCompletedEvt.CompletionInfo!.Checkpoint;
        if (checkpoint != null)
        {
            CheckpointInfo = checkpoint;
        }
    }

This can then be persisted and used to resume the workflow on the next request.
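One way to persist that information between requests is to serialize a small session record. The sketch below is an assumption about what such a record might hold, not code from the sample: `WorkflowSession`, `SessionState`, and `WorkflowSessionSerializer` are hypothetical names, and `SessionState` only mirrors the state values used elsewhere in this post.

```csharp
using System.Text.Json;

// Hypothetical enum mirroring the workflow states used in this post.
public enum SessionState { Initialized, Executing, WaitingForUserInput }

// Hypothetical record: what the application stores between requests.
// CheckpointId is null until the first super-step has produced a checkpoint.
public sealed record WorkflowSession(
    string ConversationId,
    string RunId,
    string? CheckpointId,
    SessionState State);

public static class WorkflowSessionSerializer
{
    // Turns the session into JSON that can be written to any durable store.
    public static string ToJson(WorkflowSession session) =>
        JsonSerializer.Serialize(session);

    // Restores the session on the next request; rejects a JSON "null" payload.
    public static WorkflowSession FromJson(string json) =>
        JsonSerializer.Deserialize<WorkflowSession>(json)
            ?? throw new JsonException("Session payload was empty.");
}
```

The JSON string can go wherever the application already keeps per-conversation data, such as a database row or a distributed cache entry keyed by the conversation id.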
Resuming a Stateless Workflow
Microsoft's documentation shows how to resume a workflow using a stateful in‑memory execution model. In their example, the workflow receives a RequestInfo event (triggered by a Request Port), automatically constructs a response, and sends it straight back into the workflow. Because everything happens in the same process and in memory, the workflow continues immediately.
However, this model is not stateless. It assumes that:
- The workflow instance is still alive in memory.
- The engine already knows exactly where execution last paused.
- No checkpoint re-hydration is required.
For a real‑world, stateless workflow architecture, you must manage these pieces yourself.
To resume a stateless workflow, your application must:
- Track workflow state externally, so you know that the workflow paused at a Request Port and is waiting for user input.
- Retrieve the correct checkpointId that corresponds to the paused super‑step.
- Re-hydrate the workflow by providing both the runId and checkpointId to the workflow engine.
- Resume the in‑process execution stream, which restores the workflow to the exact point where it paused.
- Inject the user’s response as the Response Message for the Request Port, enabling the workflow to continue.
In other words, the workflow framework provides the ability to pause and resume—but in a stateless system, your application orchestrates the resume process by mapping user responses back to the correct checkpoint and explicitly restarting execution.
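The first two steps in the list above amount to a guard that runs before the engine is touched. The following is a minimal sketch of that guard under stated assumptions: `PausedRun`, `PausedState`, `ResumePlan`, and `ResumePlanner` are hypothetical types invented for illustration and are not part of the framework.

```csharp
using System;

// Hypothetical state values for the externally tracked run.
public enum PausedState { Initialized, Executing, WaitingForUserInput }

// Hypothetical view of what the application stored about the run.
public sealed record PausedRun(string RunId, string? CheckpointId, PausedState State);

// Everything needed to re-hydrate the run and answer the pending request.
public sealed record ResumePlan(string RunId, string CheckpointId, string UserReply);

public static class ResumePlanner
{
    // Validates the stored state and returns the data required to resume.
    public static ResumePlan Plan(PausedRun run, string userReply)
    {
        if (run.State != PausedState.WaitingForUserInput)
            throw new InvalidOperationException($"Run '{run.RunId}' is not waiting for user input.");

        if (string.IsNullOrEmpty(run.CheckpointId))
            throw new InvalidOperationException($"Run '{run.RunId}' has no checkpoint to resume from.");

        return new ResumePlan(run.RunId, run.CheckpointId, userReply);
    }
}
```

Failing early here means a stray or duplicate user message cannot resume a run that is not actually paused at a Request Port.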
Based on the state of our workflow, we decide whether to start a new run or resume an existing one.
    public static async Task<Checkpointed<StreamingRun>> CreateStreamingRun<T>(
        this Workflow<T> workflow,
        T message,
        WorkflowState state,
        CheckpointManager checkpointManager,
        CheckpointInfo? checkpointInfo) where T : notnull
    {
        switch (state)
        {
            // First request: start a fresh run with the incoming message.
            case WorkflowState.Initialized:
                return await StartStreamingRun(workflow, message, checkpointManager);
            // Follow-up request: re-hydrate the run from the stored checkpoint.
            case WorkflowState.WaitingForUserInput:
                return await ResumeStreamingRun(workflow, checkpointInfo, checkpointManager);
            default:
                throw new ArgumentOutOfRangeException(nameof(state), state, "Unsupported workflow state.");
        }
    }

Understanding Workflow Messages Replay
When a Request Port emits a RequestInfo event, the workflow immediately pauses and persists its state to the checkpoint store. At this moment, the workflow returns control to your application, and the workflow engine has effectively stopped. The workflow is now in a state that is specifically marked as waiting for user input.
Later, when the user submits their response—for example, in ASP.NET Core, via a completely new HTTP request—the workflow does not automatically resume. Instead, your application must:
- Look up the stored workflow state to determine that the workflow is indeed waiting for a response.
- Re-hydrate the workflow using the stored RunId and CheckpointId.
- Resume the workflow's in-process execution stream so that execution continues from the paused super-step.
When the workflow resumes, the Request Port’s RequestInfo message is automatically replayed. This replay ensures that the workflow sees the same message it saw before pausing. However, this time, because your application now has the user's actual input, you provide the corresponding Response Message instead of waiting.
This replay-and-respond pattern is the key to handling stateless workflows: the workflow sees the same request message twice, but only the second time has the external data needed to continue execution.
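The pattern can be illustrated without the framework at all. The toy below is only a simulation under stated assumptions: `ToyWorkflow`, `PendingRequest`, and the JSON checkpoint string are hypothetical stand-ins for the engine, the RequestInfo message, and the stored checkpoint.

```csharp
using System;
using System.Text.Json;

// Hypothetical stand-in for the request a Request Port emits.
public sealed record PendingRequest(string RequestId, string Question);

// Toy simulation of replay-and-respond; NOT the framework's engine.
public static class ToyWorkflow
{
    // First pass: emit the request and a checkpoint that remembers it, then stop.
    public static (PendingRequest Request, string Checkpoint) Start(string question)
    {
        var request = new PendingRequest(Guid.NewGuid().ToString("N"), question);
        return (request, JsonSerializer.Serialize(request));
    }

    // Second pass (for example, in a new process): restoring the checkpoint
    // surfaces the same request again.
    public static PendingRequest Resume(string checkpoint) =>
        JsonSerializer.Deserialize<PendingRequest>(checkpoint)
            ?? throw new JsonException("Checkpoint was empty.");

    // This time the caller has the answer, so execution can continue.
    public static string Respond(PendingRequest replayed, string answer) =>
        $"{replayed.Question} -> {answer}";
}
```

In this toy, `Start` is the first request, where the caller can only show the question to the user. `Resume` followed by `Respond` is the second request, where the same question reappears and the stored answer is supplied.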
Based on the state of our workflow, we decide whether we are receiving a request from the agent or sending the human's response.
    if (evt is RequestInfoEvent requestInfoEvent)
    {
        switch (State)
        {
            case WorkflowState.Executing:
            {
                var response = requestInfoEvent.HandleRequestForUserInput();
                State = WorkflowState.WaitingForUserInput;
                return response;
            }
            case WorkflowState.WaitingForUserInput:
            {
                var resp = requestInfoEvent.Request.CreateResponse(new UserResponse(message.Text));
                State = WorkflowState.Executing;
                await run.Run.SendResponseAsync(resp);
                break;
            }
        }
    }

Until Next Time
Stateless workflows in the Microsoft Agent Framework hinge on three core mechanics: Request Ports that pause execution and ask for external input, Checkpoints that persist the workflow state, and replay behaviour that restores the workflow’s context when it resumes.
By explicitly managing workflow state, storing checkpoint IDs, and resuming execution only when the correct user input is available, you create workflows that are robust, scalable, and able to survive process restarts or distributed architectures.
Together, these patterns allow you to build real-world, production-ready agent workflows that seamlessly integrate human input while remaining fully stateless and resilient.