How much do bugs cost?

Introduction

Except for the most trivial pieces of software, every program contains bugs. If bugs are a fact of life for a developer, it makes sense to try to learn something about why they occur and how much they cost the industry. It also makes sense to find ways to minimise this cost.

In this blog post, I'm going to examine these questions and hopefully answer a few of them.

What is a bug?

Before we can talk sensibly about bugs, we need a working definition of one.

Whenever I write software for my company, the first thing we do before writing any code is to write a specification of the business rules the code is going to fulfil. So we could say that a bug occurs when the code that implements those rules does not match what is written in the specification.

Job done? Not quite. You see, the specification is too high-level to anticipate everything that follows from the rules. When you start writing code, you quickly discover that the small set of business rules in the requirements document explodes into a huge list of derived requirements. The "right thing to do" at this point would be to update the specification with all these derived rules. In the real world, this almost never happens.

So our working definition of a bug is simply this: a program produces an unexpected or undesirable result when it is executed. It's worth remembering the definition I gave above, though, because later on I will return to specifications.

Why do bugs happen?

Almost all the bugs I've seen have been caused by a failure to satisfy a pre- or post-condition. What do I mean by this? Take the following code snippet:

	public void SendEmail(Account account)
	{
		// Implicitly assumes account is not null.
		SendEmail(account.EmailAddress);
	}
	

There is an undocumented assumption here that the "account" variable is not null. We call this sort of assumption a pre-condition: a condition that has to be true before you call the method. If it isn't true, the method will fail. Whether you like it or not, code is absolutely littered with pre-conditions. Some are written down explicitly; some are merely implied by the code contained in the method, as my example above shows.
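One way to surface that assumption is to write it into the code. Here's a minimal sketch (the guard clause is my illustration, not part of the original example):

	public void SendEmail(Account account)
	{
		// Make the pre-condition explicit: fail fast with a clear
		// exception instead of a NullReferenceException further in.
		if (account == null)
			throw new ArgumentNullException("account");

		SendEmail(account.EmailAddress);
	}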

Another type of bug occurs when you fail to satisfy a post-condition. A post-condition is something that should always be true after you've executed a routine. For example:

	public static void AddCredits(Account account, int amount)
	{
		account.Credits += amount;
		// BUG: persistence has been commented out, so the account
		// is never saved and the post-condition is never met.
		// account.Persist();
	}
	

The code is meant to add credits to an account and persist it to the database. The persist line has been commented out, so it never runs. This is a failure to satisfy the post-condition of the routine: the account should always be persisted after credits have been added to it.

Code is also littered with post-conditions. When you call a routine, what you're basically saying is, "If I fulfil your pre-conditions, will you do this job for me and satisfy the post-condition?" I'd say that well over 90% of the bugs I repair are the result of a misunderstanding of what the pre- and post-conditions of various methods are. When a developer breaks the contract, the program breaks.

Aren't I just repeating the mantra of Design by Contract? Well, what I'm trying to say here is that, regardless of whether you think Design by Contract is a good idea, you are already dealing with contracts. Every. Single. Time. You. Write. Code.

The first step to writing code with fewer errors is to acknowledge that these contracts exist. I support Design by Contract because it makes documenting these contracts part of the development process. That leads to fewer errors, because you think harder about each method and you're constantly asking whether you've satisfied its conditions.
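To make that concrete, here's a rough sketch of what acknowledging the contract could look like in plain C#, using the AddCredits example from earlier. The guard clauses and the Debug.Assert call are my own illustration, not a formal Design by Contract tool:

	public static void AddCredits(Account account, int amount)
	{
		// Pre-conditions: a real account and a positive amount.
		if (account == null)
			throw new ArgumentNullException("account");
		if (amount <= 0)
			throw new ArgumentOutOfRangeException("amount");

		int creditsBefore = account.Credits;
		account.Credits += amount;
		account.Persist();

		// Post-condition: the credits really were added.
		// Debug.Assert is only checked in debug builds.
		System.Diagnostics.Debug.Assert(
			account.Credits == creditsBefore + amount);
	}

Even if the assertions are stripped from release builds, writing them forces you to state the contract explicitly, which is the point.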

Why can't we just tell the developers to be more careful?

Given how much trouble bugs cause, why do they still happen? Why can't we eliminate them from software completely by demanding that our developers be more diligent? There are a number of answers to this question.

It's important to realise that not all bugs are created equal. They range from the extremely minor to the extremely serious. For example, an incorrect accent over a letter in the French localisation of your product is probably not too urgent. At the other end of the scale, a bug could take down the entire computing system of a multinational company.

Writing code that contains no bugs at all is not an option unless you have very deep pockets. Therefore, most organisations focus on removing all the serious bugs and leaving the less important ones. Is a bug that a customer never discovers really a bug, after all?

Given that bugs are inevitable, how much do they cost?

Bugs cost an absolute fortune. By some estimates, as much as $50 billion a year is lost in economic productivity through software bugs. That's a truly huge amount of money.

Studies say that the average developer churns out code with around seventy defects per thousand lines on the first attempt. Interestingly, this figure appears to be independent of the language used. In other words, roughly seven percent of the lines developers write contain a logic defect. After many cycles of debugging, the same code will contain as few as two defects per thousand lines.

Now, lines of code are rarely a good measurement of anything; that much is true. However, it's clear there's an order-of-magnitude difference here, and it will show up no matter how you choose to define a "line of code".

Taking the defect count down from seventy per thousand lines to just two takes an enormous amount of effort. You have to employ a whole team of quality assurance people to test the software. All of these people collect a pay cheque, and each of the sixty-eight defects they find has to be fixed by the original developer. Each bug has to be communicated, isolated and removed, and then the code has to be retested. All of this costs time and money.

Then there's the problem of the two bugs per thousand lines of code that remain.

NASA found on its software projects that a bug caught in the first stages of testing cost roughly a dollar to fix. Repairing the same defect after the product was released cost around a hundred dollars. NASA is a different software development house to most, because what it writes has to be high-quality, fault-free code. Even so, it's safe to assume that fixing a bug reported by a customer costs at least an order of magnitude more than fixing it internally.

Given that bugs cost so much money, you'd think people would invest more time upfront trying to make their code less buggy. That would seem to be sensible, right?

How can I reduce this cost?

The problem with bugs is that you never really pay for them upfront. You pay for them after the code is mostly written. There is a disconnect between the cost of writing code and the cost of supporting the code you write.

Therefore, people believe writing buggy software is cheaper than writing quality software. This is simply not true: it is much cheaper to write high-quality software from the start than to repair buggy software later.

This principle falls right out of the discussion above. The cheapest bug to fix is the one that never exists in the first place. If you could get developers to produce code with two defects per thousand lines right from the start, you could fire your entire QA department and save a whole chunk of cash. The problem is that very few people who run software projects understand this basic principle.

Reducing the costs associated with bugs is all about fixing the processes that create those bugs. Rather than fixing bugs once they occur, you should try to prevent them by writing detailed specifications. Once the code is written, you should focus your attention on where the bugs are likely to be.

When talking about error rates per thousand lines of code, it is tempting to think that bugs are scattered randomly throughout the code. They are not. A number of studies have shown that around 80% of the defects occur in 20% of the code. It stands to reason that the most error-prone pieces of code contain the most complicated, poorly documented logic, while code generated automatically for object-relational mapping is probably going to be close to bug-free, provided the templates are of high quality. Large swathes of code, then, are going to be relatively bug-free.
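If you want those numbers for your own code base, a crude tally is often enough. Here's a hypothetical sketch, assuming your bug tracker can export the module name of each fixed defect to a plain text file, one name per line:

	using System;
	using System.Collections.Generic;
	using System.IO;

	public class DefectTally
	{
		public static void Main()
		{
			// Count how many defects were logged against each module.
			Dictionary<string, int> counts = new Dictionary<string, int>();
			foreach (string module in File.ReadAllLines("defects.txt"))
			{
				if (counts.ContainsKey(module))
					counts[module]++;
				else
					counts[module] = 1;
			}

			// Print the modules with the highest defect counts first.
			List<KeyValuePair<string, int>> sorted =
				new List<KeyValuePair<string, int>>(counts);
			sorted.Sort(delegate(KeyValuePair<string, int> a,
				KeyValuePair<string, int> b)
				{ return b.Value.CompareTo(a.Value); });

			foreach (KeyValuePair<string, int> pair in sorted)
				Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
		}
	}

The handful of modules at the top of that list are where inspections, and the occasional rewrite, will pay off most.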

This insight is actually good news. It means that to remove most of the bugs from your software, you only have to focus on a relatively small part of the overall product. If you have a buggy module that just doesn't seem to work correctly, it may be better to document its interface and simply rewrite it. IBM found the same thing. This passage is taken from one of my favourite books, Code Complete, and it speaks for itself:

Capers Jones reported that a focused quality-improvement program at IBM identified 31 of the 425 classes in the IMS system as error prone. The 31 classes were repaired or completely redeveloped, and, in less than one year, customer-reported defects against IMS were reduced ten to one. Total maintenance costs were reduced by around 45 percent. Customer satisfaction improved from "unacceptable" to "good".

The best weapon against bugs is not technology; it is people and processes. Formal inspections have been shown to remove up to 90% of defects before the first test case is run. An inspection involves a group of people sitting around a table, scouring the source code of the latest product for defects. This is not a technical solution to buggy code; it's a people solution. Yet it's much better at removing defects than any of the technical solutions out there.

It also sits well with the theory above. If bugs disproportionately affect a certain part of the system, you can run formal inspections on those modules and ignore most of the code base. At my company, we have a good idea which areas of the code are the most buggy. I imagine this is true elsewhere; most developers know which parts of their code are especially error-prone.

Now, formal inspection sounds like quite an expensive thing to do, but it isn't. Here's an excerpt from a paper called "Software Reliability: Principles and Practices":

The average detection cost of bugs found by code reading was $10.62. The average detection cost of bugs found in normal testing activities was $251.60 ($196.70 of this was computer time.)

This paper was published in 1976, so let's be generous and say that the cost of computer time has now gone down to effectively zero. Even then, finding a bug through traditional testing ($251.60 minus the $196.70 of computer time leaves $54.90) is still roughly five times more expensive than finding it by inspection ($10.62). If you're one of those people who believe a study that old isn't relevant (and I'm not one of them), more contemporary examples of the power of inspections can be found in Peer Reviews in Software by Karl E. Wiegers.

The great thing about inspections is that you can use the data from them to feed back into the design process. Are you finding a lot of null pointer errors? Add that to the inspection checklist. Are you finding errors caused by ambiguous specifications? Fix the quality of the requirements documents, and so on.

Given that inspections are the most powerful bug-removal technique yet discovered, how come most teams don't use them? Well, first of all, most aren't even aware that the technique is as powerful as the research shows. Those that are aware often shy away from the confrontation that inspections can bring. Developers have a big personal stake in their work, and watching it be picked over for problems is a traumatic experience for some.

Having developed software for five years, I'd urge any developer to resist getting defensive over their code. It takes a collection of small miracles to ensure your code isn't shit. Between huge schedule pressure, changing requirements and all the other interruptions most developers face, you should expect to make errors. Occasionally some will be particularly egregious.

Even if you can't face other people looking at your code, inspect it yourself. Before you start writing new code, check the work you did yesterday. Just those few hours away from the office before the self-inspection give you sufficient detachment from the problem to spot bugs. The technique is not as powerful as having other people inspect your code, but I've removed some particularly silly bugs by inspecting my own work. Even done alone, it's powerful enough to reap rewards.

Conclusion

Bugs are a very expensive part of programming. The earlier you eliminate them, the cheaper they are to fix. The best way to find bugs is formal inspection, and if you find a routine or class that is especially buggy, go ahead and rewrite it.

In my view, buggy code doesn't come from a lack of sufficient tools; it comes from the people and the process. As Gerald Weinberg famously said:

No matter what they tell you, it's a people problem.

I couldn't agree more.

Simon.

2007-02-04 08:34:57 GMT | #Programming