I read this article recently, on making sure that your automation isn’t stale. Tests, like food, work best when they’re fresh.
When trying to figure out how to get an automation suite under control (getting rid of staleness), I like to visualize it as a hedge. We generally throw out stale food, but a hedge can be trimmed up and shaped nicely.
Plus, I found the graphic and decided on the post title already, so I’m gonna roll with it.
A follow-up question to the article was: what strategies would people recommend for getting rid of tests like this?
What They Are and How They Happen
First, we need to know what they even are. How would somebody say “this test needs to be pruned out”? What tests fit the criteria?
Second, knowing how they happen is important to prevention. If you can stop them from becoming stale in the first place, I bet that would solve a large part of the problem.
A test is ready to be pruned when it’s not providing much value.
For all of the examples below, read them through the lens of your current job and codebase. Not all of them will work for everybody.
There’s already a test that does something similar
The Cause: Duplicate tests happen when multiple people are writing tests independently. Or maybe it’s one forgetful person.
The Prevention: Have regular code reviews, or pull requests, or something, so people are not only aware of what tests are being written, but what kinds they are, too.
To Prune: Figure out what the most valuable parts of the tests are, and either get rid of all but one entirely, or cobble together a new one. And then let people know what you did.
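As a minimal sketch of what "cobbling together a new one" can look like in practice: two near-duplicate tests get folded into a single data-driven test that keeps the valuable cases from both. The function `normalize_username` here is purely hypothetical, a stand-in for whatever the duplicates were exercising.

```python
# Hypothetical function both duplicate tests were exercising.
def normalize_username(raw):
    return raw.strip().lower()

# Before: two tests that differ only in their input data.
def test_normalize_trims_whitespace():
    assert normalize_username("  alice  ") == "alice"

def test_normalize_lowercases():
    assert normalize_username("ALICE") == "alice"

# After: one test covering the valuable cases from both.
def test_normalize_username():
    cases = [
        ("  alice  ", "alice"),  # kept from the whitespace test
        ("ALICE", "alice"),      # kept from the casing test
        ("  ALICE  ", "alice"),  # combined case, cheap to add here
    ]
    for raw, expected in cases:
        assert normalize_username(raw) == expected
```

The same consolidation works with JUnit parameterized tests or pytest's `@pytest.mark.parametrize`; the shape is what matters, not the framework.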
It’s testing a dormant part of the system
The Cause: When new features are written, lots of activity is happening. Devs are changing and redeploying, and tests are being run and rerun. Automation helps with faster turnaround. But once the activity moves elsewhere in the system, these tests are still hanging around, still running.
The Prevention: Take a breath, and remember that soon, this part of the system is going to settle down. Determine just the small number of tests you need to prove that part of the system is working. Rely on your eyes and your brain to find bugs.
To Prune: Prune back the set of tests on that part of the system, to just a token number of tests. Test it with smoke-level testing, just to make sure nothing’s horrendously broken. Or, consider marking/tagging that test in some way to not have it run all the time, but only when needed.
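One lightweight way to do that marking, sketched here with the standard library's `unittest`: gate the dormant-area tests behind an environment variable so they only run when explicitly requested. The variable name `RUN_DORMANT_TESTS` and the `checkout_flow` function are made-up names for illustration.

```python
import os
import unittest

def checkout_flow():
    # Stand-in for the dormant part of the system under test.
    return "ok"

class DormantAreaTests(unittest.TestCase):
    @unittest.skipUnless(
        os.environ.get("RUN_DORMANT_TESTS") == "1",
        "dormant-area tests only run when RUN_DORMANT_TESTS=1",
    )
    def test_checkout_flow_still_works(self):
        self.assertEqual(checkout_flow(), "ok")
```

Then `RUN_DORMANT_TESTS=1 python -m unittest` runs them on demand, and a normal run skips them. If you're on pytest, custom markers plus `-m "not dormant"` accomplish the same thing.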
“While we’re here…”
The Cause: When test automation is challenging to write, you may start seeing combinations of similar data being tested, just to be thorough. In reality, it’s sometimes about coverage/due diligence/increasing test count. I don’t have anything against JUnit Theories/Scenario Outlines/etc. but those are prime places for this anti-pattern to show up.
The Prevention: Work toward having test automation that’s really trivial to write. That not only shortens the time it takes to write a test, but mentally, people are more prepared to drop a test that doesn’t represent a huge investment of time.
To Prune: This is similar to pruning out duplicate tests. Ask the question: is [this test] really going to find something that [that test] is NOT going to find? If the answer is “nope” then delete one of them. Or, if you really feel the need (and it’s reasonable), set up the remaining test to use randomized data each run.
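Here's a rough sketch of the randomized-data option: instead of a pile of near-identical combinations, one test draws fresh inputs each run. The `round_trip` encode/decode pair is hypothetical, standing in for whatever the combinations were exercising, and the failure message logs the input so a failing run can be reproduced afterward.

```python
import random
import string

# Hypothetical code under test: a UTF-8 encode/decode round trip.
def round_trip(s):
    return s.encode("utf-8").decode("utf-8")

def test_round_trip_randomized():
    rng = random.Random()  # deliberately unseeded: fresh data every run
    for _ in range(25):
        length = rng.randint(0, 40)
        s = "".join(rng.choice(string.printable) for _ in range(length))
        # Include the input in the message so a failure is reproducible.
        assert round_trip(s) == s, f"round trip failed for {s!r}"
```

If you find yourself leaning on this pattern a lot, a property-based testing library like Hypothesis does the same thing more rigorously, including shrinking failing inputs down to minimal examples.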
There’s a “fear smell” about the tests
The Cause: Some people write a buttload of tests for fear of Missing Something. And nobody really knows what they’re looking for, just that they want to make sure nothing bad happens. The result is, some places are heavily tested for generic problems, and other places might be tested lightly. It really only provides a warm fuzzy feeling, instead of proving that code is release-ready.
The Prevention (and, To Prune, too): This is more a social solution than anything. People need to be less afraid of not writing tests, and be more aware of the tests they are writing. For each test, ask the question: What is this test being written for? What does it mean if it fails? If a test really doesn’t add any value, delete it.
It’s testing a low-risk part of the system
The Cause: In certain company cultures, it’s expected to have near-100% coverage of everything. But if we’re honest with ourselves, we don’t need that much coverage. We can go even further and say that some bugs are… not exactly acceptable, but a lot less damaging than others that could be caught.
The Prevention: Identify the parts of the system that really are high-risk and beef up testing there. But otherwise, ask the question: What would happen if [this particular bug] happened [here]? Would it be the end of the world? Would we lose a bunch of money? Would it just be a minor annoyance? Would the customer even really notice?
To Prune: Consider marking/tagging/quarantining tests like this and review them with your team. Historically, has there ever been a bug like this, and if so, what, if anything, did the customer say about it? If it’s sufficiently low-risk, just delete the test. Leverage the fact that customers value quick turnaround on fixing their problems, too; if something semi-important does make it out and it’s fixed quickly, that’s favorable for you.
It’s testing a cosmetic change
The Cause: Sometimes, we need to make a cosmetic change to our interface. Maybe element location or type is changed. An automated test on this doesn’t add terribly much value, because this kind of change doesn’t happen often. If this test did fail, it’d probably be for a more global reason than “the element didn’t have the right color”, but would instead be, “the page literally didn’t load because the server is down”.
The Prevention: “Just because you can, doesn’t mean you should”. You might find more junior automators writing tests like this because it’s novel, and they can do it. And yeah, it’s a good exercise, but in the long run it doesn’t gain much, since now it has to be maintained. Prevent these tests by talking about what should and shouldn’t be automated, but also by giving people enough interesting stuff to work on, to scratch the kind of itch that causes these kinds of tests.
To Prune: Really, you could prune by just not writing these tests in the first place.
The feature being tested is really mature
The Cause: Once code is released, the customers are like an army of unwitting testers. If they find something wrong with your product, you will be sure to hear about it. But tests likely still exist around that feature even when that part’s been working great for a while.
The Prevention: First, as you write your tests, mark or tag them so they can be isolated later. But once a feature’s been out in the wild for a while, with no complaints, stop running those tests. Talk with your team about what a good time frame is for this: a month? Three? A year? If a group of customers who are basically mob testing your code haven’t found anything, it’s highly unlikely that your continually passing tests will either.
To Prune: If a feature’s been in the wild for an even longer (and agreed upon) amount of time, just remove the tests entirely.
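One way to make that agreed-upon time frame mechanical is to encode it next to the tests, so the suite itself flags when a feature has aged past the threshold. This is a sketch, not a prescription: the release date and the 90-day window here are invented placeholders your team would replace with its own numbers.

```python
import datetime
import unittest

FEATURE_RELEASED = datetime.date(2015, 1, 15)  # hypothetical release date
STABLE_WINDOW = datetime.timedelta(days=90)    # team-agreed "mature" threshold

def feature_is_mature(today=None):
    """True once the feature has been in the wild past the agreed window."""
    today = today or datetime.date.today()
    return today - FEATURE_RELEASED > STABLE_WINDOW

class MatureFeatureTests(unittest.TestCase):
    @unittest.skipIf(
        feature_is_mature(),
        "feature stable past the agreed window; candidate for deletion",
    )
    def test_feature_behaves(self):
        self.assertTrue(True)  # stand-in for the real checks
```

The skip message doubles as a prompt: every time someone sees it in the test output, it nudges them toward actually deleting the tests rather than letting them linger.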
The test is trying to do everything

The Cause: Sometimes you’ll find these huge tests that do a whole bunch of stuff. Maybe it’s in the name of “well I needed to put some more functionality somewhere and this will work,” or, “I don’t want to teardown/setup every time so I’ll save time by putting it in one big test”. For every line of test code, the probability of failure increases. And if a test is doing too much and fails early, it’s possible that the rest of that God Test would have found something else, which it now missed.
The Prevention: Write tests with specific goals in mind, and put a description somewhere. Tests that do way more than what they say on the label are a telltale sign of this kind of test.
To Prune: Break up tests like this into smaller ones. They can run faster separately (which is great in the age of cloud computing and parallel processing) and can probably find more bugs separately, than all together.
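A minimal before/after sketch of that split, using an invented shopping-cart example (the domain doesn't matter; the structure does):

```python
# Hypothetical code under test.
def make_cart():
    return {"items": [], "total": 0}

def add_item(cart, name, price):
    cart["items"].append(name)
    cart["total"] += price
    return cart

# Before: one God Test. If the first assert fails, the later checks
# never run, and whatever they would have caught goes unreported.
def test_cart_god_test():
    cart = make_cart()
    add_item(cart, "book", 10)
    assert cart["total"] == 10
    add_item(cart, "pen", 2)
    assert cart["total"] == 12
    assert cart["items"] == ["book", "pen"]

# After: each test builds its own cart, so the tests pass or fail
# independently and can be distributed across parallel runners.
def test_single_item_total():
    cart = add_item(make_cart(), "book", 10)
    assert cart["total"] == 10

def test_totals_accumulate():
    cart = add_item(add_item(make_cart(), "book", 10), "pen", 2)
    assert cart["total"] == 12

def test_items_recorded_in_order():
    cart = add_item(add_item(make_cart(), "book", 10), "pen", 2)
    assert cart["items"] == ["book", "pen"]
```

The small tests repeat a little setup, but that's the trade: independent failures and parallelizable runs in exchange for a few duplicated lines.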
It’s guarding against a long-fixed bug

The Cause: When a bug is found, it’s nice to be able to repeatedly have the test suite tell you “nope” until the bug’s fixed, so you know when to release. But bugs are usually pretty localized in the code, and once they’re fixed, they probably won’t un-fix in that exact same way. Having these tests hanging around offers little coverage.
The Prevention: If you need automation for seeing when bugs are fixed, that’s fine, but keep them separate from the rest so you can stop running (or just delete) them later. Ideally though, you probably want to have tests written at the unit or integration level.
To Prune: Quarantine these and then talk with the team to determine how risky that part of the code is, and whether this bug has a history of cropping up. If signs are favorable for deleting them, do so.
…But Keep The Roses
Hopefully you didn’t start pruning while reading this post, because it’s important to “keep the roses”: the interesting parts of the tests that have intriguing code or solutions implemented. Set these aside and decompose them later; tricks that were put in those places might be helpful for future tests.
It’s important to practice good automation hygiene. Are you, your team or your company being slowed down by having to maintain a huge number of tests? I can help with that. Would you like to know how?