Originally posted on http://www.howtomeasureanything.com/forums/ on Monday, December 22, 2008 4:23:05 PM, by Dynera.
I have your book on order and I was wondering if you cover the measurement of prevention effectiveness. For instance, I was involved with a software architecture team whose purpose was to vet architecture designs. We received several feedback forms saying that our input was useful, but besides that we really didn’t see any other way to demonstrate our effectiveness.
Any feedback is appreciated.
First, my apologies for the long delay in my response. Between the holidays and site problems, it’s been quite a while since I’ve looked at the Forums.
Vetting architectural designs should have measures similar to any quality control process. Since you are a software architect, I would suggest the following:
1) Ask for more specifics in your feedback forms. Ask if any of the suggestions actually changed a design and how. Ask which suggestions, specifically, changed the design.
2) Count these “potential cost savings identified” for each design you review.
3) Periodically go into more detail with a random sample of clients and your suggestions to them. For this smaller sample, take a suggestion that was identified as one that resulted in a specific change (as in point #1 above) and get more details about what would have changed. Was an error avoided that otherwise would have been costly and, if so, how costly? Some existing research or the calibrated estimates of your team might suffice to put a range on the costs of different problems if they had not been avoided.
4) You can estimate the percentage of total errors that your process finds. Periodically have independent reviewers separately assess the same design and compare their findings. I explain a method in Chapter 14 for using the findings from two different vetters to determine the number of errors that neither found. In short, if both vetters routinely find all of the same errors, then you probably find most of them. If each vetter finds lots of errors that the other does not find, then there are probably a lot that neither finds. You can also periodically check the process by the same method used to measure proofreaders: simply add a small set of errors yourself and see how many are found by the vetter.
This way, you can determine whether you are catching, say, 40% to 60% of the errors instead of 92% to 98% of the errors. And of the errors you find, you might determine that 12% to 30% would have caused rework or other problems in excess of 1 person-month of effort had they not been found. Subjective customer responses like “very satisfied” can be useful, but the measures I just described should be much more informative.
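As a rough illustration of the seeded-error check from point 4, here is a minimal Python sketch. The function names and the numbers in the example are hypothetical, not taken from the book, and the sketch assumes the planted errors are about as hard to spot as the real ones. With only a handful of seeded errors the resulting rate is a coarse estimate, so treat the output as a range rather than a point.

```python
def seeded_detection_rate(seeded, seeded_found):
    """Fraction of deliberately planted errors that the vetter caught."""
    if seeded <= 0:
        raise ValueError("need at least one seeded error")
    if not 0 <= seeded_found <= seeded:
        raise ValueError("seeded_found must be between 0 and seeded")
    return seeded_found / seeded


def estimate_real_errors(real_found, seeded, seeded_found):
    """Scale the real errors found by the detection rate seen on seeded errors.

    Assumption: real errors are caught at the same rate as seeded ones.
    """
    rate = seeded_detection_rate(seeded, seeded_found)
    if rate == 0:
        raise ValueError("no seeded errors were found; the rate is too low to scale from")
    return real_found / rate


# Hypothetical numbers: 10 errors planted, 5 of them caught, 20 real errors reported.
rate = seeded_detection_rate(10, 5)
total = estimate_real_errors(20, 10, 5)
print(f"detection rate: {rate:.0%}, estimated real errors in the design: {total:.0f}")
```

If the vetter catches half of the planted errors, the real errors reported are probably about half of what is actually there.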
Thanks for your interest.
I can’t find the explanation of the method you mentioned in your post in chapter 14 of the book. The idea of the method is clear to me, but I think some more detail would be useful.
Thanks for the comment. Actually, the example I meant to refer to is in chapter 9 on pages 141-142 (the fish example). You use two different sampling methods to select from a population. Based on how many items came up in both samples, you can infer the size of the population. In the case of the fish, you cast the net two different times and look at the number of fish that were caught in both nets (you can tell which fish caught in the second net were caught in the first because you tagged them after catching them the first time.)
You can do the same with proofreaders or code testers. Suppose you have two different vetters checking a document and one finds 12 errors and another finds 14. Furthermore, 10 of the errors were caught by both. This indicates that they probably caught most of the errors between them. On the other hand, if only 2 errors were caught by both vetters and they each found several that the other did not, then there are probably still lots of errors found by neither. Now, this assumes something about the vetters. This works if the vetters find errors more or less at random. But if the vetters use similar strategies, they might both find almost the same list of errors while many errors remain undiscovered.
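To put rough numbers on that example, here is a minimal Python sketch using the standard Lincoln-Petersen capture-recapture estimate (total ≈ first count × second count ÷ overlap). I am assuming that simple form here; the function names are hypothetical, and the second case reuses the 12 and 14 counts with an overlap of 2 purely for illustration.

```python
def estimate_total_errors(found_by_a, found_by_b, found_by_both):
    """Lincoln-Petersen estimate of the total number of errors present.

    Assumption: each vetter finds errors more or less at random and
    independently of the other.
    """
    if found_by_both == 0:
        raise ValueError("no overlap between vetters; the estimate is undefined")
    return found_by_a * found_by_b / found_by_both


def estimate_missed_errors(found_by_a, found_by_b, found_by_both):
    """Estimated number of errors that neither vetter found."""
    total = estimate_total_errors(found_by_a, found_by_b, found_by_both)
    distinct_found = found_by_a + found_by_b - found_by_both
    return max(0.0, total - distinct_found)


# Case from the post: 12 and 14 errors found, 10 in common.
# Second case is illustrative only: same counts, but just 2 in common.
for overlap in (10, 2):
    total = estimate_total_errors(12, 14, overlap)
    missed = estimate_missed_errors(12, 14, overlap)
    print(f"overlap {overlap}: about {total:.1f} errors in total, "
          f"about {missed:.1f} found by neither vetter")
```

With 10 errors in common the estimate says fewer than one error was missed; with only 2 in common it suggests that most of the errors are still undiscovered.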
While this method has some important restrictions and warnings, it’s still a powerful method for estimating something most people assume would be impossible to estimate: the number of things unseen.