Data-driven tests by convention

April 22nd 2022 Unit Testing .NET

It pays in the long run to learn the capabilities of your unit testing framework and use them to make test code more maintainable. Let us go through the process of refactoring a set of copy-pasted tests into a single parameterized (i.e., data-driven) test.

The method under test is a bit contrived, but chosen to illustrate the difficulties we may encounter when writing data-driven tests. It accepts a string as input and returns a structured result:

public class WordAnalyzer
{
    public WordAnalysis Analyze(string word)
    {
        // ...
    }
}

public record WordAnalysis(
    string Word,
    int Length,
    IReadOnlyDictionary<char, int> LetterCount);
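
The body of Analyze is elided above. For readers who want to run the tests that follow, here is a minimal sketch of what it could look like. It is an assumption inferred from the expected values used in the tests (letters are counted case-insensitively, and Length is the length of the input); skipping non-letter characters is my own guess, not something the article states.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record WordAnalysis(
    string Word,
    int Length,
    IReadOnlyDictionary<char, int> LetterCount);

// Hypothetical implementation: the article does not show the real method body.
public class WordAnalyzer
{
    public WordAnalysis Analyze(string word)
    {
        // Count letters case-insensitively, so "NUnit" yields 'n' = 2.
        // Ignoring non-letter characters is an assumption of this sketch.
        var letterCount = word
            .Where(c => char.IsLetter(c))
            .GroupBy(c => char.ToLowerInvariant(c))
            .ToDictionary(group => group.Key, group => group.Count());

        return new WordAnalysis(word, word.Length, letterCount);
    }
}
```

With this sketch, new WordAnalyzer().Analyze("Test") produces a length of 4 and the counts e = 1, s = 1, t = 2, which matches the expectations in the tests below.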

The naïve approach to testing would likely result in a series of similar tests that check the outputs for different inputs:

[Test]
public void AnalyzeWorksForNUnit()
{
    var word = "NUnit";
    var expected = new WordAnalysis(word, 5, new Dictionary<char, int>
    {
        ['i'] = 1,
        ['n'] = 2,
        ['t'] = 1,
        ['u'] = 1,
    });

    var actual = new WordAnalyzer().Analyze(word);

    actual.Should().BeEquivalentTo(expected);
}

[Test]
public void AnalyzeWorksForTest()
{
    var word = "Test";
    var expected = new WordAnalysis(word, 4, new Dictionary<char, int>
    {
        ['e'] = 1,
        ['s'] = 1,
        ['t'] = 2,
    });

    var actual = new WordAnalyzer().Analyze(word);

    actual.Should().BeEquivalentTo(expected);
}

To add a new test, we would need to copy an existing test and change the input word and expected results. With this approach, it is not immediately obvious to the reader that these tests differ only in the input and expected result, but are otherwise identical.

Unit testing frameworks provide built-in support for such cases, which makes the intent much clearer. We can create a single test method with parameters for different inputs and outputs. The parameter values are specified using attributes. We usually call such tests data-driven tests:

[Test]
[TestCase("NUnit", 5, "i1,n2,t1,u1")]
[TestCase("Test", 4, "e1,s1,t2")]
public void AnalyzerWorksForWordsFromAttributes(
    string word,
    int wordLength,
    string letterCount)
{
    var expected = new WordAnalysis(
        word,
        wordLength,
        letterCount.Split(',').ToDictionary(
            pair => pair[0],
            pair => int.Parse(pair.Substring(1)))
    );

    var actual = new WordAnalyzer().Analyze(word);

    actual.Should().BeEquivalentTo(expected);
}

Since we cannot define a dictionary in an attribute, I encoded the expected values for the dictionary in a string and converted this string into a dictionary in the test code. This makes the meaning of the test parameters less obvious to the reader. A more complex data structure would make this approach even more difficult.
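
To make the cost of this encoding more concrete, the decoding step can be pulled out and looked at in isolation. The class and method names below are hypothetical and exist only for illustration; the logic is the same Split and ToDictionary call used in the test above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that isolates the decoding logic from the test.
public static class LetterCountEncoding
{
    // Decodes "i1,n2" into { 'i' = 1, 'n' = 2 }: the first character of each
    // comma-separated pair is the letter, the rest of the pair is the count.
    public static Dictionary<char, int> Decode(string encoded) =>
        encoded.Split(',').ToDictionary(
            pair => pair[0],
            pair => int.Parse(pair.Substring(1)));
}
```

Calling LetterCountEncoding.Decode("e1,s1,t2") yields a dictionary with three entries. A mistake in the encoded string is only detected when the test runs: a repeated letter such as "t1,t2" makes ToDictionary throw an ArgumentException, and a missing count makes int.Parse throw a FormatException. Neither failure has anything to do with the method under test.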

We can get around the limitations of attributes by specifying the parameters in a separate method instead. This way, we can use the entire C# language. The attribute then only needs to refer to the method that provides the values:

private static IEnumerable<WordAnalysis> GetAnalyzerWords()
{
    yield return new ("NUnit", 5, new Dictionary<char, int>
    {
        ['i'] = 1,
        ['n'] = 2,
        ['t'] = 1,
        ['u'] = 1,
    });
    yield return new ("Test", 4, new Dictionary<char, int>
    {
        ['e'] = 1,
        ['s'] = 1,
        ['t'] = 2,
    });
}

[Test]
[TestCaseSource(nameof(GetAnalyzerWords))]
public void AnalyzerWorksForWordsFromMethod(WordAnalysis expected)
{
    var actual = new WordAnalyzer().Analyze(expected.Word);

    actual.Should().BeEquivalentTo(expected);
}

Now the only remaining responsibility of the test method is to call the method under test and assert the returned result. This is very simple to understand. The test inputs and outputs are specified in a separate method where we can now use the dictionary initializer syntax instead of having to find a way to encode the values in a string.
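
A side note on the assertion itself: all the tests above compare with BeEquivalentTo instead of plain equality. Presumably this is because the equality generated for a record compares the LetterCount dictionary by reference, so two separately constructed but identical results are not equal. The sketch below demonstrates this with a hand-written structural comparison. The helper names are hypothetical and only stand in for what a structural assertion does for you.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record WordAnalysis(
    string Word,
    int Length,
    IReadOnlyDictionary<char, int> LetterCount);

// Hypothetical helpers for illustration only.
public static class EqualityDemo
{
    // Builds a fresh instance with its own dictionary on every call.
    public static WordAnalysis Create() =>
        new("Test", 4, new Dictionary<char, int>
        {
            ['e'] = 1,
            ['s'] = 1,
            ['t'] = 2,
        });

    // Record equality: false here, because Dictionary does not override
    // Equals and the two instances hold different dictionary objects.
    public static bool RecordEquals() => Create() == Create();

    // Structural comparison: compares the dictionary entry by entry.
    public static bool StructurallyEqual(WordAnalysis a, WordAnalysis b) =>
        a.Word == b.Word
        && a.Length == b.Length
        && a.LetterCount.Count == b.LetterCount.Count
        && a.LetterCount.All(pair =>
            b.LetterCount.TryGetValue(pair.Key, out var count)
            && count == pair.Value);
}
```

EqualityDemo.RecordEquals() returns false, while EqualityDemo.StructurallyEqual(EqualityDemo.Create(), EqualityDemo.Create()) returns true.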

Often this is good enough. But now imagine that the method under test is a REST endpoint and not a normal method. In this case, it might make more sense to specify the expected result as JSON, since that's what the endpoint will return:

{
  "Word": "NUnit",
  "Length": 5,
  "LetterCount": {
    "i": 1,
    "n": 2,
    "t": 1,
    "u": 1
  }
}

C# syntax makes it inconvenient to define JSON literals in code. I think a better approach is to add them as content files to the test assembly. Our method that returns the test inputs and outputs can then read these files instead of having the values hardcoded:

private static IEnumerable<TestCaseData> ReadAnalyzerWords()
{
    var testFilesFolder = Path.Combine(
        TestContext.CurrentContext.TestDirectory,
        "WordAnalysis");
    var testFiles = Directory.GetFiles(testFilesFolder);
    foreach (var testFile in testFiles)
    {
        var json = File.ReadAllText(testFile);
        var wordAnalysis = JsonSerializer.Deserialize<WordAnalysis>(json);
        if (wordAnalysis != null)
        {
            yield return new TestCaseData(
                Path.GetFileNameWithoutExtension(testFile),
                wordAnalysis);
        }
    }
}

[Test]
[TestCaseSource(nameof(ReadAnalyzerWords))]
public void AnalyzerWorksForWordsFromFiles(string word, WordAnalysis expected)
{
    var actual = new WordAnalyzer().Analyze(word);

    actual.Should().BeEquivalentTo(expected);
}

The method assumes that each file in the specified folder is a case for that particular test. A new test can now be added by creating a new file in that folder. Its name is the input for the method under test and its content is the expected result.
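
The convention itself can be sketched independently of any test framework. The class below is a hypothetical illustration, not the code from the article: it maps every file in a folder to an input word and the raw JSON content. Restricting the search to *.json files and sorting the result are my additions. The first keeps stray files from turning into test cases, the second gives a stable order because Directory.GetFiles does not guarantee one.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper illustrating the "file name is the input" convention.
public static class TestCaseFiles
{
    // Returns one (input word, expected JSON) pair per *.json file.
    public static IEnumerable<(string Word, string Json)> Read(string folder)
    {
        var files = Directory
            .GetFiles(folder, "*.json")
            .OrderBy(file => file, StringComparer.Ordinal);

        foreach (var file in files)
        {
            // "WordAnalysis/NUnit.json" becomes the input word "NUnit".
            yield return (
                Path.GetFileNameWithoutExtension(file),
                File.ReadAllText(file));
        }
    }
}
```

For a folder containing NUnit.json and Test.json, the method yields two pairs with the words "NUnit" and "Test". One consequence of the convention is that inputs are limited to strings that are valid file names on the file system in use.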

In the examples above, I used NUnit. But you can achieve the same in the other two major unit testing frameworks for .NET with only a slightly different syntax.

Here is the last test rewritten for MSTest:

private static TestContext _testContext = null!;

[ClassInitialize]
public static void ClassInitialize(TestContext testContext)
{
    _testContext = testContext;
}

private static IEnumerable<object[]> ReadAnalyzerWords()
{
    var testFilesFolder = Path.Combine(
        _testContext.DeploymentDirectory,
        "WordAnalysis");
    var testFiles = Directory.GetFiles(testFilesFolder);
    foreach (var testFile in testFiles)
    {
        var json = File.ReadAllText(testFile);
        var wordAnalysis = JsonSerializer.Deserialize<WordAnalysis>(json);
        if (wordAnalysis != null)
        {
            yield return new object[]
            {
                Path.GetFileNameWithoutExtension(testFile),
                wordAnalysis
            };
        }
    }
}

[TestMethod]
[DynamicData(nameof(ReadAnalyzerWords), DynamicDataSourceType.Method)]
public void AnalyzerWorksForWordsFromFiles(string word, WordAnalysis expected)
{
    var actual = new WordAnalyzer().Analyze(word);

    actual.Should().BeEquivalentTo(expected);
}

The only notable difference is how the test context is accessed. In MSTest, it is not available through a static property, so it must be stored in a static field by the ClassInitialize method, which receives it as a parameter.

And here is the same test in xUnit:

private static IEnumerable<object[]> ReadAnalyzerWords()
{
    var testFilesFolder = Path.Combine(
        Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location)!,
        "WordAnalysis");
    var testFiles = Directory.GetFiles(testFilesFolder);
    foreach (var testFile in testFiles)
    {
        var json = File.ReadAllText(testFile);
        var wordAnalysis = JsonSerializer.Deserialize<WordAnalysis>(json);
        if (wordAnalysis != null)
        {
            yield return new object[]
            {
                Path.GetFileNameWithoutExtension(testFile),
                wordAnalysis
            };
        }
    }
}

[Theory]
[MemberData(nameof(ReadAnalyzerWords))]
public void AnalyzerWorksForWordsFromFiles(string word, WordAnalysis expected)
{
    var actual = new WordAnalyzer().Analyze(word);

    actual.Should().BeEquivalentTo(expected);
}

There is no test context in xUnit. Instead, the location of the test assembly is used to find the content folder containing the test cases.

I have put the full code for all three test frameworks in my GitHub repository. If NUnit is not your testing framework of choice and you want to see the code for the other approaches as well, you can read it there.

You should treat your unit test code with the same care as your application code. You also need to maintain it and read it when a test fails to find out what it was testing and why it failed. Making your tests easier to understand when you write them will make this process much more efficient.

