It’s possible to write too much test automation.
I get that it’s cool and fun to come in and slam out code that does work for you so you don’t have to. Getting to learn new tech to get the job done is a great feeling.
That feeling is kind of like the game Mega Man, when you destroy the boss robot at the end of the level, and you get to use his weapon for the rest of the game.
But, as with any code, each line that you write has to be maintained.
That’s a problem if you’re writing a ton of automation. If maintaining your existing automation keeps losing out to writing new automation, the existing stuff starts to suffer from bit-rot.
Another problem comes up when multiple tests fail, and it turns out they all failed for the same reason. You’ll have to spend time and money investigating each failure, which takes away from other fun stuff.
If writing lots of automation is going to be a problem, one solution is to write less of it. The question is: which automation do you write?
This leads me to a concept called Tripwire Testing.
I don’t know if this concept is already A Thing or if there’s an official term for it. But I’ll do my best to explain. Maybe a commenter will see this and say, “Oh you’re talking about [that].” I kind of hope so.
I’m sure you know what a tripwire is. There’s a picture of one at the top of this post (or at least, the picture offered by Google). You see them in movies; the really cool ones are made of infrared lasers, and you need special glasses and canned air to see them. Those things. All they do is let you know when someone or something has crossed a threshold. That’s it.
When the tripwire is tripped, then you go investigate. Maybe a camera flips on so you can see what happened. WHAT IF IT’S A BAD GUY! Nope, it’s just a cat. Or maybe you send a detachment of armed soldiers. Cats can be dangerous. Either way, you address the issue when it happens, not maybe if it happens.
For software, then: instead of checking every condition by looking everywhere sequentially (even if that’s really quick), you, the human, inspect the area whenever a qualifying event happens (such as a particular test failing).
Like a tripwire going off.
Ok, so what does this look like?
First, recall that the purpose of automation is to remove repetitive work: in this case, having to run tests manually.
Second, pulling this kind of automation off requires a firm knowledge of your code base. Automation’s job is not to tell you things about the system that you don’t know how to find out yourself.
Tripwire automation has the following characteristics:
- a certain type of failure should only show up once per group of tests run
- every test written should give you unique info about the code under test
Checking for Certain Failure Types
If you had 50 tripwires across a door, and someone walked through it and tripped them all, what would happen? What would happen if something smaller walked through and tripped only 5?
Wouldn’t it mean the same thing as if one tripwire were strategically placed so that anytime anything walked through the door, it would go off? You’d still have to check.
In the same way, if you have 50 tests that all fail for the same reason, you’d have to go check. Unlike a physical tripwire though, you’d have to investigate all the failures. What a colossal waste of time!
That points to a different problem. Maybe you have some bad data in a database, or an upstream webservice that’s misbehaving.
A second tripwire is needed in this case. Create a test that checks for that specific problem, then make the first set of tests dependent on this new test passing.
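Here’s a minimal sketch of what that dependency could look like, in plain Python. The names (`check_upstream_service`, `run_group`) are my own placeholders, not a real framework’s API; in practice you’d wire this up with whatever gating feature your test runner provides.

```python
# Tripwire sketch: one upstream check gates a whole group of dependent tests.
# check_upstream_service is a hypothetical stand-in for hitting the real
# webservice or querying the database for the bad data directly.

def check_upstream_service():
    # Pretend the upstream dependency is healthy.
    return True

def run_group(tripwire, dependent_tests):
    """Run the tripwire test first; skip the whole group if it fails."""
    if not tripwire():
        # One failure to investigate instead of fifty.
        return {"ran": 0, "skipped": len(dependent_tests)}
    ran = 0
    for test in dependent_tests:
        test()
        ran += 1
    return {"ran": ran, "skipped": 0}
```

If the tripwire fails, you get a single, specific failure pointing at the real problem, and the dependent tests are reported as skipped rather than as fifty redundant failures.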
I’m glad I re-read the beginning of this post: I mentioned up there that unmaintained automation code suffers from bit-rot, and I almost forgot to address that down here.
As part of adopting this style of automation, it’s important to take stock of what each test, and each suite of tests, is doing.
If you have a suite of tests, what is the whole thing trying to test? Does it look like there’s a test for every possible combination of parameters?
Why? Are that many tests required, or did the tests get written with thoroughness and due diligence in mind?
Whoever wrote them probably meant well, but again, there’s no sense in having multiple tests that look for the same kind of bug.
What kind of bug is being looked for? Is it possible to pare down the tests to something more strategic?
Here’s an example from my current client. The developers on the team have a simple set of tests for the team’s webservices. All they do is make sure the services are up, and that the ones that need to talk to each other can.
We run those tests first, and they take maybe a minute. If any of them fail, none of the hundreds of subsequent tests get run.
Now that’s what I call failing fast. Imagine running those hundreds of tests without knowing whether all the services are up, and then watching whole groups of them fail. Troubleshooting those failures would be very expensive.
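The fail-fast ordering can be sketched in a few lines. This isn’t the client’s actual harness; the suite contents and service names below are made-up stand-ins for the idea: quick smoke checks run first, and the expensive suites only run if every check passes.

```python
# Fail-fast sketch: a one-minute smoke suite gates the hours-long suites.
# smoke_checks is a list of (service_name, check_fn) pairs;
# expensive_suites is a list of test lists.

def run_suites(smoke_checks, expensive_suites):
    down = [name for name, check in smoke_checks if not check()]
    if down:
        # A service is down: report which, and skip the slow tests entirely.
        return {"smoke_failures": down, "slow_tests_run": 0}
    ran = 0
    for suite in expensive_suites:
        for test in suite:
            test()
            ran += 1
    return {"smoke_failures": [], "slow_tests_run": ran}
```

With healthy services, everything runs; with one service down, you get a single named smoke failure and zero wasted hours:

```python
smoke = [("auth-service", lambda: True), ("orders-service", lambda: True)]
slow = [[lambda: None] * 3, [lambda: None] * 2]
run_suites(smoke, slow)  # all smoke checks pass, so all 5 slow tests run
```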
…where can you apply this concept in your code base?