Errata Statistics

Although my publisher assures me that some errors always make it through the proofing process, each one is still frustrating to the author – mostly because the author had the chance at some point to catch almost every one of the errors.

My wife teaches math at a local community college and taught math in a high school for many years. She tells me that every textbook she ever used had an errata sheet and that some had up to three pages of errata. In proofing, we find scores, even hundreds, of errors, so it seems likely that some will get through. Can this likelihood be computed? If you read my book, you would say “Of course!” So I started thinking that if several people each find a number of errors over a period of time, I should be able to estimate the number of undiscovered errors.

The book discusses methods for problems like this, including the catch, release and recatch approach for estimating fish populations. If two independent error-finding methods find some of the same errors but each also finds errors the other did not, then we can estimate the number that both missed. I mention in the book that this same method can apply to estimating the number of people the Census missed or the number of unauthorized intrusions in your network that go undetected.
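To make the recatch idea concrete, here is a minimal sketch in Python. It assumes the simplest form of the estimate, commonly called the Lincoln-Petersen estimator: if the two proofreaders work independently, the share of A's errors that B also found approximates the share of all errors that B found. The function name, the proofreader labels, and the error counts are hypothetical and chosen only for illustration; they are not figures from the book.

```python
def estimate_errors(found_a, found_b):
    """Estimate total and still-undiscovered errors from two independent passes.

    found_a, found_b: sets of error identifiers (e.g. page/line labels)
    found by each proofreader. Assumes the two passes are independent.
    """
    both = found_a & found_b
    if not both:
        # With no overlap the simple estimate is undefined (division by zero).
        raise ValueError("the two passes share no errors; cannot estimate")

    # Lincoln-Petersen: total ~= (found by A) * (found by B) / (found by both)
    total_estimate = len(found_a) * len(found_b) / len(both)

    # Errors found by at least one proofreader.
    discovered = len(found_a | found_b)

    missed_estimate = total_estimate - discovered
    return total_estimate, missed_estimate


# Hypothetical example: A finds errors 1..40, B finds errors 31..60,
# so they share the 10 errors numbered 31..40.
proofreader_a = set(range(1, 41))
proofreader_b = set(range(31, 61))

total, missed = estimate_errors(proofreader_a, proofreader_b)
print(f"Estimated total errors: {total:.0f}")
print(f"Estimated errors both missed: {missed:.0f}")
```

The estimate is only as good as the independence assumption. If both proofreaders tend to catch the same obvious errors, the overlap is inflated and the number missed will be underestimated.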

I had another method that I considered including in the book but, in the end, decided to leave out. This method is based on the idea that if you randomly search for errors in a book (or species of insects in the rain forest, or crimes in a neighborhood) the rate at which you find new instances will follow a pattern. Generally, finding unique instances will be easy at first but as the number
