
TDD with Microsoft Agent Framework, Part 1: xUnit Agent Testing

Learn how to use xUnit and Test Driven Development to validate Microsoft Agent Framework agents. Ensure your AI selects the right tools and handles state correctly for production.

Photo by Monineath Horn on Unsplash


This post demonstrates how to use xUnit to test the behaviour of your Microsoft Agent Framework (MAF) Agents and validate that they make the correct function calls.

If you want to move beyond the console and build production-ready applications, you must validate the behaviour of your agents and the broader system. While agents can evaluate history and state to select tools, we need to ensure they consistently select the right tools for the job.

Today, we will look at using xUnit to validate that our agents are selecting the correct tools through a structured TDD approach.

Travel Planning Use Case

In this post and others in this series, we will develop and implement a travel planning use case using xUnit, TDD, and the Microsoft Agent Framework (MAF).

We can define our Travel Planning Agent to behave like this:

The Agent uses a Travel Plan model as its source of truth for state and bases its decisions on it. The core logic is simple: if the model is missing required data, the agent must use its available tools to request that information from the user.

The Travel Model contains the following properties:

  • Origin
  • Destination
  • Departure Date
  • Return Date
  • Number of Travelers
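As a sketch of how such a state model might look (the property types and the `MissingFields` helper are my assumptions, not the sample's actual code), nullable properties make it easy to detect what the agent still needs:

```csharp
using System.Collections.Generic;

// Hypothetical sketch of the Travel Plan state model; names are assumptions.
public record TravelPlan
{
    public string? Origin { get; init; }
    public string? Destination { get; init; }
    public DateOnly? DepartureDate { get; init; }
    public DateOnly? ReturnDate { get; init; }
    public int? NumberOfTravelers { get; init; }

    // Returns the names of any required properties that are still unset.
    public IReadOnlyList<string> MissingFields()
    {
        var missing = new List<string>();
        if (string.IsNullOrEmpty(Origin)) missing.Add(nameof(Origin));
        if (string.IsNullOrEmpty(Destination)) missing.Add(nameof(Destination));
        if (DepartureDate is null) missing.Add(nameof(DepartureDate));
        if (ReturnDate is null) missing.Add(nameof(ReturnDate));
        if (NumberOfTravelers is null) missing.Add(nameof(NumberOfTravelers));
        return missing;
    }
}
```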

The Agent is provided with a single tool:

  • Request Information: Used when the agent identifies that it requires more information to complete a plan.
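A minimal sketch of what such a tool might look like (the DTO shape and the method signature are assumptions; in MAF a method like this would typically be exposed to the agent as a tool, for example via `AIFunctionFactory.Create` from Microsoft.Extensions.AI):

```csharp
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical DTO carrying the agent's request for missing data.
public sealed record RequestInformationDto(
    [property: Description("Names of the missing Travel Plan fields.")]
    IReadOnlyList<string> RequiredInputs);

public static class PlanningTools
{
    // The [Description] attributes give the model the context it needs
    // to decide when to call this tool.
    [Description("Request missing travel plan information from the user.")]
    public static string RequestInformation(RequestInformationDto requestInformationDto) =>
        $"Please provide: {string.Join(", ", requestInformationDto.RequiredInputs)}";
}
```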

This use case has been deliberately kept simple so we can set up and test our agent. In later posts, we will add more functionality, tools, and complex behaviour.

Getting Set Up

Language Model Settings

The agent factory in the samples uses token credentials rather than API keys, and takes the Azure language model settings in its constructor.

public AgentFactory(IOptions<LanguageModelSettings> settings)
{
    // Try local developer credentials in order: Visual Studio,
    // Azure CLI, then Azure Developer CLI (azd).
    var credential = new ChainedTokenCredential(
        new VisualStudioCredential(),
        new AzureCliCredential(),
        new AzureDeveloperCliCredential()
    );

    // Build a chat client against the configured Azure OpenAI deployment.
    _chatClient = new AzureOpenAIClient(new Uri(settings.Value.EndPoint), credential)
        .GetChatClient(settings.Value.DeploymentName);
}
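For reference, the `LanguageModelSettings` options class bound here can be as small as the following sketch (the sample's actual class may carry more properties):

```csharp
// Sketch of the options class bound from configuration. Note the property
// name "EndPoint", which must match the configuration key exactly.
public class LanguageModelSettings
{
    public required string EndPoint { get; set; }
    public required string DeploymentName { get; set; }
}
```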

I’ve created a helper method in the test project to get the settings.

public static class SettingsHelper
{
    private const string DeploymentName = "LanguageModelSettings:DeploymentName";
    private const string Endpoint = "LanguageModelSettings:EndPoint";

    public static IOptions<LanguageModelSettings> GetLanguageModelSettings()
    {
        var configuration = new ConfigurationBuilder()
            .AddUserSecrets<PlanningAgentTests>()
            .Build();

        var deploymentName = configuration[DeploymentName];

        var endpoint = configuration[Endpoint];

        ArgumentException.ThrowIfNullOrEmpty(endpoint, Endpoint);
        ArgumentException.ThrowIfNullOrEmpty(deploymentName, DeploymentName);

        var languageModelSettings = 
          Options.Create(new LanguageModelSettings
          {
              DeploymentName = deploymentName,
              EndPoint = endpoint,
          });

        return languageModelSettings;
    }
}

And local user secrets set up in the test project (note the key EndPoint, which must match the configuration key the helper reads):

"LanguageModelSettings": {
  "DeploymentName": "<Your-Deployment-Name>",
  "EndPoint": "<Your-Endpoint>"
}

Travel Planning Test Strategy

To validate our agent, we use a standard Arrange-Act-Assert pattern. Our goal is to prove that when the agent receives an incomplete state, it correctly identifies the missing data and invokes the appropriate tool.

Arrange

We initialize the state with partial information:

  • Provided: Destination, Departure Date, and Number of Travelers.
  • Missing: Return Date and Origin.

We then instantiate the Travel Planning agent and register the RequestInformation tool.
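Concretely, the Arrange step might look like this sketch (the `TravelPlan` stand-in and the values are illustrative, not the sample's exact code):

```csharp
using System;

// Arrange: Destination, Departure Date and Number of Travelers are provided;
// Origin and Return Date are deliberately left null.
var travelPlanState = new TravelPlan(
    Destination: "Lisbon",
    DepartureDate: new DateOnly(2025, 9, 1),
    NumberOfTravelers: 2);

// The two fields we expect the agent to ask the user for.
var expectedKeys = new[] { "Origin", "ReturnDate" };

// Minimal stand-in for the sample's Travel Plan state model (names assumed).
public record TravelPlan(
    string? Origin = null,
    string? Destination = null,
    DateOnly? DepartureDate = null,
    DateOnly? ReturnDate = null,
    int? NumberOfTravelers = null);
```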

Act

We pass the incomplete Travel Plan to the agent and execute the RunAsync method.

Assert

We verify the agent's response meets four criteria:

  1. Tool Selection: It identifies that a tool call is necessary (specifically RequestInformation).
  2. Payload Structure: The tool call contains the expected argument key: requestInformationDto.
  3. Type Safety: The arguments are correctly deserialized into the RequestInformationDto type.
  4. Logic Accuracy: The tool call explicitly requests the two missing pieces of data: Origin and ReturnDate.

The Agent Test

The sample code base contains a number of helper methods to keep the tests clean and readable. This is particularly important for two reasons:

  1. It makes it easy for humans to read and understand.
  2. It provides clean, structured examples for AI coding assistants like Claude Code or GitHub Copilot.

To make the assertions as expressive as possible, I asked GitHub Copilot to extend FluentAssertions specifically for agent tool responses. My first attempt was messy because my instructions were too vague, but once I had a clear vision of the end result, I gave Copilot step-by-step instructions on the structure and purpose of the extension.
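To illustrate the idea (this is a simplified stand-in, not the sample's actual extension, which plugs into FluentAssertions' assertion pipeline and operates on Microsoft.Extensions.AI's FunctionCallContent), a chainable assertion for tool calls can be as simple as:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for a function-call record; names are assumptions.
public record FunctionCall(string Name, IDictionary<string, object?> Arguments);

public static class FunctionCallAssertionExtensions
{
    // Chainable assertion: throws with a descriptive message when no
    // call in the collection matches the expected tool name.
    public static IEnumerable<FunctionCall> ShouldContainCall(
        this IEnumerable<FunctionCall> calls, string toolName)
    {
        var list = calls.ToList();
        if (!list.Any(c => c.Name == toolName))
            throw new InvalidOperationException(
                $"Expected a call to '{toolName}' but found: " +
                string.Join(", ", list.Select(c => c.Name)));
        return list; // return the collection so assertions can chain
    }
}
```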

The result is a test that reads almost like a specification:

  [Fact]
  public async Task PlanningAgent_ShouldRequestMissingInformationToolCall_WhenTravelPlanIsIncomplete()
  {
      var languageModelSettings = SettingsHelper.GetLanguageModelSettings();

      var templateRepository = InfrastructureHelper.Create();

      var agentFactory = new AgentFactory(languageModelSettings);

      var template = await templateRepository.LoadAsync(PlanningYaml);

      var agent = await agentFactory.Create(template, PlanningTools.GetDeclarationOnlyTools());

      var chatMessage = TravelPlanHelper.CreateTravelPlanMessage(_travelPlanState);

      var response = await agent.RunAsync(chatMessage);

      response.FunctionCalls()
          .Should().HaveCount(1).And
          .ShouldContainCall(RequestInformationToolName).And
          .ShouldHaveArgumentKey(ToolCallArgumentKey).And
          .ShouldHaveArgumentOfType<RequestInformationDto>(ToolCallArgumentKey).And
          .ShouldHaveRequiredInputs(ToolCallArgumentKey, _expectedKeys.Count, _expectedKeys);
  }

Until Next Time

Testing the behaviour of AI Agents is often seen as a challenge because of their non-deterministic nature. However, by focusing on tool selection and state validation, we can build a robust safety net for our AI-driven features. Using xUnit and custom FluentAssertions allows us to keep our tests expressive and maintainable, ensuring that our Travel Planning Agent behaves exactly as expected before it ever hits production.

In the next part of this series, we will look at more complex scenarios, including multi-turn conversations and integrating more advanced tools.

You can find the code sample here
