Duplicate Running Job Bug

The more I test, the more I find real-world examples of defects that have parallels in the Software Testing world.

Software is getting incredibly complex, and I have a strong theory that testing for certain kinds of bugs is better than testing for code coverage.

So I decided to start up a new category called Taxonomies, to have examples of these bugs, and to explain how they are like other types of bugs found in software.

The Problem

In the break room today, I noticed the coffee pot overflowed. There was coffee all over the counter, all over the burner, under the coffee pot… it was a mess.

This is the type of pot meant for industrial or professional use: rather than having to add water yourself, the water is fed in from a water line. All you have to do is add a filter, add coffee, and push the Brew button. 4 minutes and 28 seconds later, you have fresh coffee.

You’ll have even more if, for example, you hit the Brew button twice.

For whatever reason, some brewers (from more than one manufacturer, since this has now happened at 2 different clients) do not check whether a brew is already in progress when the Brew button is hit. The brewer just stacks up the job, so to speak, and brews another pot’s worth of coffee when the first one is done.

Result: A mess of coffee all over the place.

Similar Bugs in Software

One place where this kind of bug can show up is in jobs that are scheduled to run at a particular time. I’m thinking of cron jobs specifically here, but this is just one example of probably many more that you can think of.

It’s easy to set up a job that runs at a fixed interval (e.g., every hour) and manipulates data.

What becomes difficult is troubleshooting when another instance of that job spins up and tries to do the same work, when the first job hasn’t completed yet.

It’s a case-by-case thing, but if your job assumes it’s the only one running, then it’s only natural to assume that a second instance running will cause problems.

A solution for this type of bug is to have the job look to see if an instance of itself is running, before committing to actually running the job.
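Here is a minimal sketch of that check in Python, assuming a Unix-like system (where cron lives anyway) and a lock file that every instance of the job agrees on. The names `try_acquire_lock` and `run_job` and the lock path are made up for illustration; this is one common way to do it, not the only one.

```python
import fcntl


def try_acquire_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if it is already held."""
    handle = open(lock_path, "a")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the other run.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_job(lock_path, work):
    """Run work() only if no other instance holds the lock; return True if it ran."""
    lock = try_acquire_lock(lock_path)
    if lock is None:
        return False  # another instance is still brewing, so do nothing
    try:
        work()
        return True
    finally:
        fcntl.flock(lock, fcntl.LOCK_UN)
        lock.close()
```

A cron entry would then call something like `run_job("/tmp/nightly-sync.lock", do_sync)`, where both the path and `do_sync` are placeholders. One reason to lean on `flock` rather than just checking whether a file exists is that the operating system drops the lock if the first job crashes, so a dead job does not block every later run.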

In the case of coffee, this would result in much less mess: if a person hits the Brew button, simply check to see if a brew is already going, and if so, just don’t do anything. Finish brewing the current pot.

In the case of software, don’t run the same job again, but you may want to send out a message of some kind so that people can take appropriate measures. For example, why is the job running longer than expected? Is there more work to grind through? Is the first job hung in a loop? If the answer to either is yes, why?
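As a rough sketch of that message, the skipped run could report how long the earlier run has been going. This assumes the running job records its start time somewhere the second instance can read it (for example, in its lock file) and that you have a rough idea of the expected runtime. The function names, the job name, and the numbers are all hypothetical.

```python
import logging
import time

log = logging.getLogger("job_guard")


def describe_skip(job_name, started_at, expected_seconds, now=None):
    """Build the alert text for a run skipped because an earlier one is still active."""
    now = time.time() if now is None else now
    elapsed = now - started_at
    message = f"{job_name}: skipped, previous run still active after {elapsed:.0f}s"
    if elapsed > expected_seconds:
        # Running past the expected time is the case worth a human look.
        message += f" (expected about {expected_seconds:.0f}s; check for a backlog or a hang)"
    return message


def report_skip(job_name, started_at, expected_seconds, now=None):
    """Log the skip as a warning so someone can investigate, and return the text."""
    message = describe_skip(job_name, started_at, expected_seconds, now)
    log.warning(message)
    return message
```

Logging is only a stand-in here. The same text could go to email, chat, or whatever alerting you already have, as long as a skipped run does not pass silently.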

Where else can you apply identifying and preventing this kind of bug?

–Fritz


One thought on “Duplicate Running Job Bug”

  1. – backup/restore procedures
    – sample-down data
    – establish environment using tasks organised on different levels of abstraction
