The book shows that calibrated probability assessments really do work, and it gives the reader some idea of how to employ them. But facilitating a workshop – with calibrated estimates or any other formal method – has its own challenges. Participants ask questions and raise challenges about calibration, or about probabilities in general, that sometimes confound reason. It’s amazing the sorts of ideas adults have learned about these topics.

Still, I’ve found that these kinds of challenges and my responses to them have become almost scripted over the years. The conceptions and misconceptions people have about these concepts fall into certain general categories and, as a result, so do my responses.

I thought about starting out with an attempt at an exhaustive list but, instead, I’ll wait for readers to come to me. Do you have any challenges employing this method in a workshop or, for that matter, do you have questions of your own about how such a method can work? Let us know and we’ll discuss it.

I’m wondering whether and how you combine a number of subject matter experts’ 90% confidence intervals?

I would think you would often want to combine the calibrated estimates of a number of experts to produce an aggregate best estimate of the 90% CI, but I did not see this discussed in ‘How to Measure Anything.’

I do mention methods for aggregating multiple experts, and I added more in the second edition. In the first edition, I mention the Lens method and prediction markets – both of which can be used to aggregate many estimates. For 90% CIs, you are effectively making two estimates: one for the upper bound and one for the lower bound. Research by Robyn Dawes and Bob Clemen supports the finding that aggregation does improve estimates.

The simplest method for aggregating multiple experts is to average all of the estimates of the lower bound to get the aggregate lower bound, and then average all of the estimates of the upper bound to get the aggregate upper bound. This can create some problems, especially when there are very large differences among the expert estimates. Prediction markets can offset these issues, but there is, again, a simpler approach: if a set of experts disagree widely, you might try a Delphi technique. Extremely large differences are probably due to differences in information, interpretation of the questions, or implied assumptions. If the experts can share some of these, their estimates may converge.
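The bound-averaging approach described above can be sketched in a few lines of Python. The expert names and interval values here are made up purely for illustration:

```python
# Sketch: aggregate several experts' 90% confidence intervals by
# averaging the lower bounds and, separately, the upper bounds.
# The experts and numbers below are hypothetical.

intervals = {
    "expert_a": (10.0, 50.0),   # (lower bound, upper bound)
    "expert_b": (20.0, 40.0),
    "expert_c": (15.0, 60.0),
}

lowers = [lo for lo, hi in intervals.values()]
uppers = [hi for lo, hi in intervals.values()]

agg_lower = sum(lowers) / len(lowers)
agg_upper = sum(uppers) / len(uppers)

print(agg_lower, agg_upper)  # prints: 15.0 50.0
```

Note that if one expert's interval is wildly different from the others', a straight average can pull the aggregate toward an outlier – which is exactly when a Delphi round or a prediction market earns its keep.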

Thanks for your participation on the blog,

Doug Hubbard

Have you ever mentioned the Classical Method of Roger Cooke et al.? His method captures experts’ accuracy and informativeness on a set of ‘seed’ questions (questions with known answers) and optimally combines their scores into a single synthetic expert. The method satisfies a rigorous set of statistical criteria and is transparent, repeatable, and objective.

Doug:

In an answer to KDR, I would suggest some sort of optimization, whereby the individuals whose estimates were best (a posteriori) are given higher weightings.
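One way to sketch that weighting idea in Python: score each expert by the fraction of seed questions (questions with known answers) whose true values fell inside the expert’s stated 90% interval, then use those hit rates as weights when averaging the bounds. Everything here – the experts, the hit counts, the intervals – is hypothetical, and a full performance-based scheme such as Cooke’s would also score informativeness, not just accuracy:

```python
# Illustrative sketch: weight experts by how often their past 90% intervals
# contained the true value ("hit rate" on seed questions with known answers).
# All experts, intervals, and seed results below are hypothetical.

seed_hits = {"expert_a": 9, "expert_b": 6}   # hits out of 10 seed questions
seed_total = 10

intervals = {
    "expert_a": (10.0, 50.0),   # (lower, upper) 90% CI on the real question
    "expert_b": (20.0, 40.0),
}

# Hit rates become unnormalized weights; better-calibrated experts count more.
weights = {e: hits / seed_total for e, hits in seed_hits.items()}
total_w = sum(weights.values())

# Weighted average of each bound, normalized by the total weight.
agg_lower = sum(weights[e] * intervals[e][0] for e in intervals) / total_w
agg_upper = sum(weights[e] * intervals[e][1] for e in intervals) / total_w
```

With these made-up numbers the better-scoring expert pulls the aggregate interval toward their own bounds, which is the intended effect of a posteriori weighting.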

Hi all,

I’ve been playing around with the ‘expert’ module for R (see http://journal.r-project.org/archive/2009-1/2009-1_index.html), which has multiple methods coded for the derivation and combination of distributions. Sorry that it’s not for Excel, but I believe there is an R-Excel plugin.

Regards

Les