Rebuttal: “My Software is Free of Defects!” Part 2

(Part 1 is here)

When I wrote the previous post, I didn’t realize how many people would feel so strongly that what I was proposing was impossible.

I’d thought that this concept was largely considered an ideal, or a pipe dream, but many people turned out to be vehemently opposed to what I was saying. Many of the arguments came from my breakdown of the examples in the original article.

For this pass, I’d like to go a level deeper, focusing more on the statement I took issue with, and less on the examples.

The Original Statement

The original statement was that software can have a huge number of input combinations–so many that it would be impossible to test them all in a reasonable amount of time. This appears to be a core argument for why software can never be considered defect-free.

Let me start by saying: I agree with this statement. I agree that it would take an unrealistic amount of time to test through all the combinations.

What I disagree with are the extrapolations that come from this statement–ones such as:

  • There are an infinite number of combinations,
  • All input combinations need to be tested before software can be considered defect-free,
  • Software has to be proven defect-free in order to be defect-free.

Defining Infinity

When describing the number of input combinations, we’re talking about discrete numbers.

Let’s say you had a form with 100 million dropdown lists, each with 100 million choices. There would be a fixed number of combinations you could get from that data set.

It’s a huge number. But it’s not infinite.

When people use “infinity” to describe a real value, what they mean is, “a number so big that it might as well be infinite”. And for this case, yes that would be true. The number of combinations above would be astronomical, and even running tests at an obscene rate, it would take a really really really really long time to complete.

But it’s not infinite. It’s a certain value. An integer.
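To put a rough size on that integer, here is a minimal sketch. The function names are my own, made up for illustration, and it assumes every dropdown is independent of the others. It counts combinations exactly for a small form, and for the huge form it only counts how many decimal digits the total has, since the number itself is far too large to write out.

```python
import itertools
import math


def count_combinations(choices_per_field):
    """Exact number of combinations for a form with independent fields.

    choices_per_field is a list with one entry per dropdown, giving the
    number of choices in that dropdown (hypothetical form layout).
    """
    return math.prod(choices_per_field)


def digits_in_count(num_fields, choices):
    """Number of decimal digits in choices ** num_fields.

    Uses log10(c ** n) = n * log10(c), so the huge integer itself is
    never built in memory.
    """
    return math.floor(num_fields * math.log10(choices)) + 1


# A tiny form can be enumerated outright, and the count matches the product.
small_form = [3, 4, 2]
enumerated = list(itertools.product(*(range(n) for n in small_form)))
print(len(enumerated), count_combinations(small_form))

# The form from the post: 100 million dropdowns, 100 million choices each.
print(digits_in_count(100_000_000, 100_000_000))
```

The total for the big form is a number with roughly 800 million digits. Even at a billion tests per second, that only knocks nine digits off the running time in seconds. Astronomical, yes, but still a specific, finite integer.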

The problem with classifying combinations as infinite is that it immediately puts the software in a box.

It’s the kind of box where people think, “This software is too complex to test because of all the data combinations, so I’ll just do the best I can.”

That kind of box is dangerous. Here’s why.

Defining Thoroughness

I would argue that you don’t have to test all possible combinations to prove that all defects have been found.

I would agree that exhaustive testing was necessary if every combination yielded a unique defect.

But they don’t. There are wide swaths of data sets that:

  • work just fine,
  • are invalid by their very nature (e.g. huge input length when a length is clipped in the program), or,
  • all trigger the same kind of defect.

What I was trying to describe in part 1 instead was that, because of the way code is structured, you can find defects of a certain type using certain inputs–there’s an example there that will help explain, so I won’t repeat it here.

Basically, you only need one combination of input to detect a certain defect. Another combination of input that exhibits the same characteristics as the first will show the same defect.

Therefore, you only need to run one of that kind of combination, to find that kind of defect.
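Here is a small sketch of that idea, separate from the example in part 1. Everything in it is hypothetical: the length limit, the function names, and the planted off-by-one bug are assumptions made up for illustration. It picks one representative input per kind of input and shows that a second input of the same kind exposes the same defect.

```python
MAX_LEN = 8  # hypothetical field limit, chosen for illustration


def clip(text):
    """Intended behaviour: keep at most MAX_LEN characters."""
    return text[:MAX_LEN]


def buggy_clip(text):
    """Same function with a planted off-by-one defect."""
    return text[:MAX_LEN + 1]


# One representative input per kind of input, not every possible string.
REPRESENTATIVES = {
    "empty": "",
    "within limit": "abc",
    "exactly at limit": "a" * MAX_LEN,
    "over limit": "a" * (MAX_LEN + 5),
}


def failing_classes(implementation):
    """Names of the input classes whose representative exposes a defect."""
    return [
        name
        for name, value in REPRESENTATIVES.items()
        if implementation(value) != clip(value)
    ]


print(failing_classes(buggy_clip))

# A different over-limit input trips over the very same defect,
# so running it as well would add no new information.
another_over_limit = "b" * 100
print(buggy_clip(another_over_limit) != clip(another_over_limit))
```

Only the "over limit" representative fails here, and any other over-limit string fails for the same reason. Four tests say as much about this function as four million would.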

Defining Truth

Finally, a trend that I notice is more of a philosophical one–the idea that software has to be proven defect-free in order to be defect-free.

This, I disagree with. The state of something either is, or isn’t, without our proving it to be the case. Something isn’t true because we’ve proven it to be so.

The same is true of software. Of all the many possible combinations of a piece of software, there can be one that is defect-free. Maybe more.

But it still goes back to what we’re actually proving: If you were to run through all possible combinations of inputs, an accidental result would be that you’d find all the defects.

But I don’t think that should be a side effect. I think it should be an intentional goal. Testers should seek out and find defects.

Side Effects of This Thinking

I think this approach to software testing will have some interesting side effects. Particularly, I think the way we’d test would look a little different:

  • Testers would be a lot more familiar with the code they’re testing–I’m a heavy proponent of white box testing, and insist on being able to see the code being tested. If done right, white box testing won’t taint your understanding of the code, as long as you approach it with a healthy amount of “O RLY?” when making sure the code does what it says it’s actually doing.
  • Fewer redundant tests would be written–much less testing of combinations for the sake of thoroughness would be done. Why run the same kind of test to find the same kind of defect? Stop that.
  • More targeted tests would be written–if you could come up with one combination of inputs that would trigger a failure in the presence of one bug, wouldn’t you write it? I would. A large number of these kinds of tests would be a valuable regression suite.
  • Each test would tell you something new and unique about the health of the system–the more information the better.
  • More questioning about what you’re actually testing for, when writing a test–I see many testers (and I’m including myself) get stuck in the trap of slamming out tests without considering what it is that’s being tested for. Stop. Take a second, and ask yourself, “what kind of bug am I hoping to find with this test?” Take some time and classify the bug (aside: I wonder what all the classifications of bugs even are? Intriguing…). And then write one test to find that bug in that part of the code.
  • More variety of tests–there are many flavors of bugs out there. Not all of them are related to data. Some are security related, or load and performance related. The fewer redundant tests you write, the more time you have to write a wider variety of them.

I often compare testing to setting tripwires. A spider web is another good analogy.

If placed properly, a tripwire will warn you when something enters a room. If built properly, a spider web will catch flying insects (bugs, lol).

If placed improperly, a bunch of tripwires will fire off when one thing enters a room. Too much. Why have multiple failures of the same kind when only one will give you the same information? Are the other tripwires adding value? If not, get rid of them. Better yet, place them elsewhere.

If built improperly, a spider web will make for a starving spider. I would imagine that instead of “coverage” like a good web provides, it would look like a really thick single strand of spider silk. A lot of work to catch bugs along a single path is not as good as a large web that covers more area. It’s delicate, but each strand has capturing power, and all it takes is one…

Conclusion

As I said at the beginning, I hadn’t realized how strongly many people felt about what I said last time.

However, I know that ideas can propagate in different ways. I know that very few people have done the arduous work of researching, documenting and then publishing their findings about whether software can be defect-free.

A larger, second group of people will read and agree with the researchers. An even larger, third group of people will listen to suggestions made by the second group, and agree because it sounds reasonable.

But this doesn’t mean the first group was right to begin with. 

More than any other profession, it seems that testers are asked to “think outside the box”. Whether it’s true or not, I’m sure you would agree that “software can never be defect-free” is a type of box. It’s a constraint that we have to (or maybe not have to) operate within.

Even if I’m wrong, I think we can safely think outside the box on this, and come up with radical new ways to test.

It’s my hope that parts 1 and 2 have challenged you to think critically about the testing craft, and also encouraged you to try out some crazy stuff once in a while.

Thanks again for reading,

– Fritz

12 thoughts on “Rebuttal: ‘My Software is Free of Defects!’ Part 2”

  1. I didn’t read part one but will when I have time.

    I find myself in complete agreement with almost everything you said here though. Where I disagree it tends to be about choice of words more than the message.

    Don’t get cocky, but you have demonstrated deep insight into a number of issues. Not all software related.

    I am a fan of wide-ranging random tests done periodically. These tests can be subdivided to avoid resource issues. At the same time, resources can also appear limitless. Depends who you work for. An open question regardless.

    I disagree on one issue. In areas like stereo systems and now phones there have been advocates for greatly reduced complexity. This also has limits but brings the questions you address into the realm of the answerable.

    It’s interesting that some of your discussion has analogies in both philosophy and religion. Some questions are open by their nature and I really enjoyed your comments on what constitutes infinite.

    For a lot of the balance issues we encounter terms such as “for practical purposes ” and much of the rest is subjective and involves conjecture.

    I do like the way you think though. Keep it up. The thinking.

    1. mmm, vindication. om nom nom nom nom 🙂

      Thanks for the kind words. Hopefully there was nothing in the post that came across as cocky, as I strive to not come across as a know-it-all. I’m definitely not–there’s too much to learn!

      Very glad to hear there are people pushing for simple. Every place I’ve been has taken complexity as a given, and we just suck it up and deal with it. But yes, simplifying is a great way to increase chances of success.

      When we’re stuck between a rock and a hard place, we can move the rock, but simplifying the code lets us move the hard place too.

      Can’t wait to try this tactic on a product owner 🙂

      1. I had to look up nom nom nom. I keep laughing when I see it in my email.

        The cocky reference is an older expression that implies what might be overly glowing praise is meant to be legitimate. I can be a very harsh critic. And you weren’t being cocky, kid. 😉

        So enjoy the moment.

        Nom nom nom…

      2. So I waded through part one of your rebuttal and of course enjoyed the scolding you got at the end. (scoff)

        If you want to read old books try Hofstadter. He likes the mathematician Gödel, who, to paraphrase, proved that all mathematical systems complex enough to generate proofs can be used to generate contradictions. Meaning proofs generated by these systems can’t be trusted absolutely in the first place.

        So if someone wants a mathematical proof of a test or piece of software they are kind of doomed to infinite regress. And we all love infinity.

        These complaints don’t really undermine your own argument, which ironically is about sound test design. About examining the code.

        Further, your own concerns about completeness are to a degree limited in scope. If I’m selling your software my concerns are only partly addressed by your rebuttal. Nobody in my world, the real one, gives a rat’s bum about proofs and that won’t get you off the hook when a bug shows up.

        What I mean by this is I don’t want to hear about whose fault the bug is or see your proof. I want you to detect the bug before it occurs and then tell me whose fault it is.

        From this point of view your strategy is essential, and the other gentleman may well have a point. However…

        We return to an infinity. An infinity of platforms and configurations that might have their own defects. Or defects that are introduced by replacement parts and updates later. Things happening on completely different, seemingly unrelated levels.

        And buddy is worried your test demonstrates completeness or you don’t get it somehow.

        Sheesh…

  2. More feed back… I’m sure it’s clear to all you are addressing a complex systems problem as well as the scope and complexity of in depth testing.

    Complexity has its place and is essential in advanced systems design. However it is something to be used as needed and only as directed. Consult your technician before exceeding the daily recommended dosage.

    I will give an example of simplicity separately…

  3. I like the depth of consistency and simplicity employed by MS.

    All of my research is browser based. I can download documents of various types and save pages to disk in different formats.

    In addition, my bookmarks live in the same directory structure along with documents I have written using other apps or obtained through other channels.

    My needs to create relationships or associations take the form of links. Directory shortcuts. You will note this can also address cross cutting concerns.

    What is more, my attack surface and number of vendors (and external entities) has been greatly reduced. I might also reduce my training costs, support, licensing; who knows?

    I did at least reduce both my complexity and frustration while saving time. Why it just makes me want to give Bill Gates a big hug.

    Has this never occurred to any other browser vendor? I think not.

    Now if I could just secure my browser life would be grand.

    Feel free to reuse this example if it helps when your client starts giving you “that look…”

    Rebuttals from well read geniuses telling me to read more books are always welcome.

    Regards, Dave Horsman

  4. You know I still haven’t read part one. It’s what makes my life such fun.

    I wonder what all the fuss was about…

    Should be interesting… Or maybe just annoying. I’ll let you know.

  5. I say I do see a problem here though. You see I thought this blog was a rebuttal to the previous blog you wrote.

    It turns out that isn’t true. This blog is a continuation or qualification of a rebuttal you wrote to another article! Which I haven’t read yet.

    Well, I promise to read it too and I will let you know what I think.

    Having said that, I do feel your test strategy is quite adequate. If the client does have additional resources tell them to invest in real time problem collection in the field. In rapid response to problem reports and in communicating to the user community that you care about them and this unanticipated bug. That you are fixing it right now and are on their side…

    Crowd source your testing?

    Which is why I always want to kick Bill Gates in the butt, after I hug him.

    Anyway my bad. I will read the original article now. It turns out it wasn’t what I didn’t know, but what I didn’t know I didn’t know that was the problem. But any psycho politician could tell you that…

    On the other hand, we sometimes learn more from going bottom up, thinking outside the box, externally even. Maybe in the end I will learn more by reading in reverse; by thinking outside the feedback loop. Who knows?

    Regards, Dave Horsman

  6. OK… read the article. Didn’t really learn anything about testing or resources. As you suggested completeness is constrained by resources.

    So the question is can you make a defect free claim without testing until the universe ends. Yes you can. Do it with confidence!

    Of course make sure your tests are exhaustive. Test for things that are outside the unlimited scope of this tiresome mathematical proof.

    Like what happens when some other piece of software or internal module throws an exception. Not even covered here…

    Make your claim though. It’s missing the phrase “for practical purposes.” At the same time your competition won’t add that either.

    Develop a rapport with your client. Educate them about these issues. If they that. They are the boss.

    Just make sure you do deliver. Be the best tester. Help create the best software. Provide a solution that is more complete and a responsiveness more relevant than the proofs and debate you are being drawn into here.

    Regards, Dave Horsman

  7. Given the lack of rebuttal to my rebuttal to your rebuttal I would add an additional comment for your consideration.

    The concept of design proof is a sort of stack popping where the debate of “provable” has merely been “shifted” to a higher level. It’s highly subjective at that level as well.

    Having said that, rigorous review of design is essential. Design is the foundation for code and within this analogy good code can only be built upon solid design.

    “The code is the design?”

    In RAD / Agile terms it can be. If we are looking at a simple app or isolated component we can use this paradigm, increase the number of iterations and both succeed and save money.

    I’ve lost track of the number of times I’ve written a class or prototype suspecting or knowing the code would be rewritten or discarded. Quickly and cost effectively.

    However when writing complex systems, or at least at the application level, it is essential to “have a good plan” in order to produce a quality product in a timely manner.

    Can you “prove” the design? Not economically. Not mathematically.

    However there is a methodology for generating incomplete proofs. I refer to it as “The Robust Customer Beta” proof. Which was the original crowd sourcing approach. Annddd it was ALWAYS cool!

    In house, or inside closed domains, there are parallel systems (resources?) as well as the idea of parallel writes.

    Preserving both data formats requires fewer resources and has some vague hope of proof. It’s easy to phase out or turn on and off. So compared to parallel tests it’s at least doable.

    Parallel tests can then be limited or periodic even. Lots of options. Easier to demonstrate to both clients and managers.

    Is a client fixated on formal proof going to haggle over these resources? If so, get a new client or job; you have a problem on your hands. One you can’t solve never mind prove.

    AND.. did I forget to mention that “The data IS the design?”

    Gee, are we talking methodology, paradigm or philosophy? Ah… Who cares anyway!

    In closing I should add that WordPress can’t copyright this. That’s because you can’t copyright either the public domain or common sense.

    1. Just waiting for the dust to settle 🙂

      Haven’t seen anything to disagree with, really. There’s a lot of good insight here, and it’s just going to take a change in perception and culture if this idea’s going to go anywhere.
