The Case for WET Tests

I was recently working on a feature enhancement for a React/GraphQL/Apollo project. The basic premise of the feature was:

As a user negotiating in a marketplace, I want to be able to set an expiration date on my counter offer, and I only want to be able to select a valid date.

The feature was working in development, and I wanted to update the existing tests to account for the new functionality.

I navigated to the test file and found that the test file had a defined class object to automate the test set up. This class object was responsible for building the users and transactions required for this feature, as well as for completing and submitting the form. It’s sort of acting like a page object but forces you to always do all the things in a particular setup path.

In each describe block, all you had to do was construct an instance of the class with the proper arguments, tell it to run through the form, and make sure it passed all the assertions. It was as simple as:

const form = Form.withInputs();

The case for DRY tests

By the third day of being a developer, you’ve likely heard of the importance of writing DRY code– don’t repeat yourself, write code that can be reused, profit! DRY code is good code, and a goal to strive for. "Wanna See the Code" XKCD comic

But I like to think about DRY-ness like comic art. It’s important to learn the fundamentals, and then use that knowledge to know when you need to break the rules to achieve a specific effect. The DRY principle is a helpful foundation for learning good coding practices, but it can be a rule worth breaking in the interest of clarity. Sometimes it’s useful to keep your tests WET (“Write Everything Twice”).

At first glance, it’s easy to see why the author made this abstraction. It’s a complex bit of UI that requires setting up multiple models, filling out a form, and mocking at least one server request. It would be nice to be able to do all of that in a line or two.

But after working on this feature for a while and adding more tests, I ultimately decided that the lack of repetition in this code came with trade-offs that I was unwilling to make.

Writing tests that scale with your team

Programs must be written for people to read, and only incidentally for machines to execute.

– Harold Abelson, Structure and Interpretation of Computer Programs

Gaining context on a new area of the code base you haven’t worked in before is already a recurring challenge on large development teams. It’s also particularly vital when writing and maintaining tests.

If tests function to ensure that our application is working as expected, we must always be confident that the changes we make to it are the right ones. A green test suite is useless if it isn’t testing the behavior we want. If our test suite takes too long for a newcomer to understand, its a strain on velocity. If our team members don’t feel confident updating it, it’s a product liability.

Abstractions can be useful tools, but can also increase the cognitive load on the next person to work with your code. I find it helpful to always view a new abstraction as a trade-off between de-duplication and added complexity. Choosing the abstraction isn’t always wrong, but it’s important to be honest about how it might affect the next developer who has to use it.

In this case, it took me about 30 minutes to really understand what the class was doing, plus the usual time to understand what we are testing and why. This was a signal that the helper class might introduce undue complexity.

Minimizing dependencies between tests

Explicitly declaring vital setup steps like building props, mounting components, and navigating UI inside of an appropriately scoped describe block often means we get more helpful error messages. When each test sets up only the information it needs, it is much less likely to break in unpredictable ways and with vague errors. If a now required prop is missing, typescript will error on the props declaration. If you are missing a mock, the component won’t mount successfully. If an input you’re trying to change doesn’t exist, the tests will tell you so.

Keeping the test setup tightly scoped also decreases the likelihood that a change in the component’s behavior will break unrelated tests, because we’re only building what we need for each test to accomplish its goal. If we’re writing a test to make sure that a modal appears when a user clicks a button, we shouldn’t also have to worry that an input inside the modal can’t be changed just yet.

This was another key warning sign that our utility class was making our lives harder. The implementation of this feature had changed quite a bit, but the resulting UI change was very small– it added a date field with some validations. But it broke every test, because every test was using the same setup regardless of what it was actually testing. It also meant that where I expected an error that said “I expected the validations to pass, but they didn’t”, the error message simply pointed to the instantiation of the class, because something went wrong somewhere in there. The only way to debug was to put a bunch of console.logs in each step to track down the Mystery Guest responsible for the failure.

If this logic had been broken out into steps, it would have been easy to see that the component mounted correctly, but selecting the expiration date failed.

Requirements change, abstractions break

Duplication is far cheaper than the wrong abstraction

– Sandi Metz, “The Wrong Abstraction

At the time the author wrote this test, it was a great abstraction that saved several lines for each describe block. But once we identified a new business need that we had to support, we fundamentally changed the behavior of the feature, and by extension, the way we needed to test it.

This is not to say that you should never extract a helper because the business needs are going to change. Products evolve and requirements will inevitably change. As developers, we’re constantly thinking about how to write reusable code that can grow and scale over time. But over-abstracting in tests tends to be more expensive than in business code. After all, we probably won’t approach a thorny bit of source code and think, “Do I really have to build this feature?”. But we might approach a difficult to test component and think, “Do I really need to write a test for this?”.

As consultants, part of what we consider when choosing what patterns to apply is their maintainability over time. We find that prioritizing decisions that will create the least friction to writing tests increases productivity and decreases costs over time.

Writing untested logic

I generally only write utilities in tests if it can be accomplished with a pure function that doesn’t do any “heavy lifting” or complicated logic. My guiding principle here is usually asking, “Would I write a unit test for this function if it were in my source code?”. This is a process we use all the time when we prioritize how to test our features. Complex functions that are core to the business logic? Definitely write some tests. A helper function that just wraps some of JavaScript’s language features? Maybe not.

For instance, I feel comfortable not testing a simple function that builds my test component’s props:

interface ComponentProps {
  label: string;
  value: string;
  order: number;
}

const DEFAULT_PROPS: ComponentProps = {
  label: 'Test label',
  value: 'This is a test value',
  order: 1,
};

const buildProps = (props: Partial<ComponentProps>) => {
  return {...DEFAULT_PROPS, ...props};
};

In this case, I know that TypeScript is ensuring that I pass valid values into my utility function, and we shouldn’t spend time testing that language features (like JavaScript’s spread operator) work as expected. I’d give this function the green light. A helper function that builds data, interacts with the DOM, and has side effects introduces additional complexity: when we have a bug that was not caught by the tests, could it be because we introduced faulty, untested logic in our helper?

Being strategic about what kinds of helpers you introduce into your tests achieves the same purpose of writing tests in the first place– to reduce the amount of untested logic in your application. For longer test files, I’ll usually introduce helpers that each do one small part of the setup, such as building props, mounting a component, or locating and returning a nested element on the page. Once it goes beyond writing pure functions, I’d rather re-write a few lines in the appropriate scope than feel like I should write a test for my test function.

Making the right abstractions

The answer to the question “Should I extract this test setup into a helper function?” is often still “Maybe”. But a good follow up question is often “Why?”. Will the extraction make your tests more robust? More readable? Easier to update and write? Or will it just make the code more DRY? DRY test code should be a means to achieving the goals of your test suite, rather than a goal in and of itself.