Testing in 3D

Automated testing is a facet of software engineering that most people agree holds value, but usually the engineer will wander off before offering any further clarification. There is something of a blind faith that given full coverage, an engineer’s work is complete and they can continue solving the rest of the business’s problems.

The issue with carelessly creating tests without purpose is that changes and enhancements to a large system can become unnecessarily more complex and difficult over time. It is worth noting here I will not differentiate between unit and integration tests, as in practice the lines get very blurred depending on the practitioner. Instead, I will take a look at three conceptual purposes of automated tests in an effort to define what makes them useful.

Design

This is the category of tests most frequently created when practicing Test-Driven Development. If you are not familiar with this, you can think of these tests as micro-specifications. You define what you are going to build by way of a failing test, it fails because you haven’t built it, and then you make what you said you were going to make. The test passes and you repeat.
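As a minimal sketch of that loop in Python: the test is written first and acts as the micro-specification, and the function underneath is the smallest thing that makes it pass. The name slugify and its behavior are invented purely for illustration, and the plain test_ function assumes a runner such as pytest that collects functions by that prefix.

```python
# Step 1: the micro-specification. Written first, it fails until slugify exists.
def test_spaces_become_hyphens():
    assert slugify("Testing in 3D") == "testing-in-3d"


# Step 2: the smallest implementation that satisfies the specification.
# slugify is a hypothetical function, not part of any real library.
def slugify(title: str) -> str:
    """Lowercase the title and join its words with hyphens."""
    return "-".join(title.lower().split())
```

Each new behavior gets its own failing test before the implementation grows to cover it.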

Design tests are helpful for making interactive notes and/or plans for your code in the event you get distracted. They are also extremely appropriate when prototyping, research spiking, and proving a concept. Instead of pushing data all the way through a system to identify if your change had the desired effect, you can measure it in some degree of isolation. Design tests also reduce the likelihood of you creating a proof of concept then dropping it on a more junior engineer to clean up after you.

Not all codebases are appropriately set up for this. If there is very tight coupling in inappropriate places, it can be harder to generate more surgical design tests. With some time, care, and effort a codebase can be reshuffled to make smaller validation points more natural. Those kinds of changes should always be considered valuable refactoring efforts.

Document

Good code documents itself. This is an evergreen lie that software engineers love to tell each other. It does not. Code is sometimes legible to the original author. Sometimes even up to a week or two later. While the commit message and the associated tickets can provide more supplementary context, consider using tests to also cleanly illustrate the purpose of a set of methods.

Documentation tests differ from Design tests as the intended reader should be at a broader scope. It is very important that Documentation tests do not traverse a maze of inherited setup. There is a fallacy amongst some developers that DRY applies to tests. When abstracting away setup, you force the reader of a test to track down all aspects that may be affecting the current operation. Imagine having to read this entire post to understand a single sentence. Then starting over the entire post to understand this next sentence. As entertaining as I am, that is still pretty brutal.

Instead, abstract away only the parts that are irrelevant. If they are irrelevant, can something like dependency injection abstract them away further? This can lead to code that leans into testability. Embrace testability. There is a benefit both from the validation and from the predictable structure for readers.
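One way this might look, as a sketch rather than a prescription: the class below takes its clock as a constructor argument, so each test states everything that affects the outcome within its own few lines. TrialAccount, its 30-day rule, and the dates are all hypothetical, chosen only to show the shape.

```python
from datetime import datetime, timedelta


class TrialAccount:
    """Hypothetical class. The clock is injected so tests need no shared setup."""

    def __init__(self, started_at, now=datetime.now):
        self.started_at = started_at
        self._now = now  # any zero-argument callable returning a datetime

    def is_expired(self):
        # Assumed rule for this example: a trial lasts 30 full days.
        return self._now() - self.started_at > timedelta(days=30)


def test_trial_expires_after_thirty_days():
    started = datetime(2024, 1, 1)
    account = TrialAccount(started, now=lambda: started + timedelta(days=31))
    assert account.is_expired()


def test_trial_is_still_active_on_day_thirty():
    started = datetime(2024, 1, 1)
    account = TrialAccount(started, now=lambda: started + timedelta(days=30))
    assert not account.is_expired()
```

The two tests repeat a line of setup on purpose. A reader can understand either one without opening another file.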

Defend

If you are like me, your code always runs flawlessly the first time. The second time it runs, it is even better. After you increase the load 10x, the CPU is reprogramming its own microcode in awe. But, as a thought experiment, let’s pretend we make mistakes. Imagine that we get woken up at 3am due to a production error which has led to nefarious computer hackers, undoubtedly wearing hoodies, breaching the system. Since very clearly i should only be below 200, we notice the problem, fix it, and then go back to blissful sleep.

Or we could write a very specific test. A test where i is allowed to equal 200 and terrible, bad things happen. This test must reference a ticket or incident number in a very clear way, by name or comment or whatever your team agrees on. What we are trying to do here is to signal to all readers:

This is real life, real life is messy. If you changed something and this test failed, you were about to make an unexpected mistake. That mistake may ruin a dream where you are, in fact, a puppy.

As more participants enter a codebase, they all want to put their touches into it. By establishing a set of defensive tests, you create an environment where that is a safe endeavor. Just have an agreed upon prefix or standard that communicates to later readers these defensive tests were born out of production events. Treat these tests with more care and respect than other tests. Someone may have literally lost sleep over them.
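A defensive test for the i example might be sketched like this. The process function, the defend prefix, and the identifier INCIDENT-0000 are placeholders made up for the illustration. Substitute whatever convention and real incident reference your team uses.

```python
MAX_I = 200  # exclusive upper bound: i must stay below 200


def process(i: int) -> int:
    """Hypothetical stand-in for the code that misbehaved in production."""
    if i >= MAX_I:
        raise ValueError(f"i must be below {MAX_I}, got {i}")
    return i


def test_defend_incident_0000_rejects_i_of_200():
    # DEFEND: born from a production event (placeholder id INCIDENT-0000).
    # If this fails, the boundary check was loosened or removed.
    try:
        process(200)
    except ValueError:
        return
    raise AssertionError("process(200) was accepted; see INCIDENT-0000")


def test_defend_incident_0000_still_accepts_199():
    # The fix must not reject the largest legitimate value.
    assert process(199) == 199
```

Putting the reference in the test name means it shows up in the failure output, not only in a comment someone has to go looking for.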

Bonus Dimension: Deleted

Why limit ourselves to a boring, predictable three dimensions? The most exciting type of test is the one that can be deleted. I’m not advocating taking a clean slate approach and wiping everything out, but I am suggesting that tests are not necessarily permanent.

Design tests are the prime candidates for deletion. By their nature, design tests are fragile. They tend to be generated in a stream of consciousness for the benefit (only) of the original author. If the true purpose has been accounted for by another test, send the design tests on their way. They should be considered scaffolding: they serve a purpose during construction, but make life less pleasant after everything is built.

Documentation tests, and subsequently defensive tests, tend to linger. It should be clear to the reader when these no longer belong in the system. If a test refers to a feature that has been deprecated, it is a good time to remove it. Sometimes, these types of tests can aid in removing dead code, as they document a subsystem that is now vestigial.

Mind the machinery

A distant cousin of the design test is the machinery test. These are the types of tests used to prototype and inspect a third-party dependency, library, or service. While good for learning, these have absolutely no place in your project. The truth is that if your SQL query worked once, it will continue to be valid SQL for every commit moving forward. Unless you live in the make-your-own-rules anarcho-paradise that is untyped JavaScript, you should be able to trust that your dependencies are stable enough to not break your builds. No value is generated by validating these types of things, but countless person-hours are lost trying to diagnose them.
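For recognition purposes, a machinery test tends to look something like the sketch below, which uses Python's built-in sqlite3 module as the dependency being poked at. The test name is invented. The giveaway is that none of the project's own code runs anywhere in it.

```python
import sqlite3


def test_machinery_sqlite_can_add_numbers():
    # Exercises only the sqlite3 driver. Useful while learning the library,
    # but it says nothing about the behavior of your own system.
    connection = sqlite3.connect(":memory:")
    try:
        row = connection.execute("SELECT 1 + 1").fetchone()
        assert row == (2,)
    finally:
        connection.close()
```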

Good tests tell their own story

You can’t judge a book by its cover. But you can, and you will. The name of a test will be your first, and sometimes only, impression of what is going on. Ensure your tests have descriptive and accurate titles. Convince later readers that you had at least a semblance of an idea of what you were trying to do.

In the same vein, your test should tell a story. It may not be an exciting one, but it should have a cohesive plot. Set the stage for what is about to happen. Then, the big reveal, everything worked! Clean up the loose ends and continue on with the sequel. If there is too much going on in one test, try to refactor or abstract away the parts that do not add to the narrative. When this is pervasive, the code can naturally restructure into a more readable state.
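A small sketch of a test with that kind of plot, using only the Python standard library. The append_line helper is hypothetical. The comments mark the stage-setting, the action, the reveal, and the loose ends.

```python
import os
import tempfile


def append_line(path, line):
    """Hypothetical helper under test: add one line to the end of a file."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")


def test_append_line_adds_to_the_end_of_an_existing_file():
    # Set the stage: a file that already contains one line.
    descriptor, path = tempfile.mkstemp(suffix=".log")
    os.close(descriptor)
    try:
        append_line(path, "first")

        # The plot: append a second line.
        append_line(path, "second")

        # The big reveal: both lines are present, in order.
        with open(path, encoding="utf-8") as handle:
            assert handle.read().splitlines() == ["first", "second"]
    finally:
        # Loose ends: remove the temporary file.
        os.remove(path)
```

The title alone says what is being claimed, and the body has nothing in it that a reader needs to skip.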