Our new consolidated blog and downloads site for the two books – How to Measure Anything and The Failure of Risk Management – was a little problematic for a few new registrants. Anyone who needs to retrieve a forgotten password may run into a problem. If you do have any problems at all, DO NOT HESITATE TO CONTACT me directly at dwhubbard@hubbardresearch.com. Thanks again for your patience.
Douglas W. Hubbard
Doug,
On page 134 I think the std dev is calculated incorrectly. Step 2 divides the variance by the sample size and then takes the sqrt. Should it be the sqrt of the variance? The std dev being 0.187?
Paul,
Actually, that’s a common confusion. The method you propose simply gets you back to the st dev of the sample, not the st dev of the estimate of the population mean. Step 1 produces the sample variance, which is the sample st dev squared. What we need is the st dev of the estimate of the mean of the population. To get it, we divide the st dev of the sample by the square root of the sample size. Alternatively, we can divide the sample variance by the sample size and take the sqrt (the approach I took).
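As a minimal sketch of why the two routes agree, the snippet below computes both on a made-up sample. The sample values are purely illustrative (they are not the data from page 134), and the n - 1 denominator for the sample variance is an assumption.

```python
import math
import statistics

# Hypothetical sample; illustrative values only, not the data from the book.
sample = [7.2, 6.8, 7.9, 7.1, 6.5]
n = len(sample)

# Sample variance (assumed here to use the n - 1 denominator) and sample st dev.
sample_variance = statistics.variance(sample)
sample_std_dev = math.sqrt(sample_variance)

# Route 1: divide the sample st dev by the square root of the sample size.
std_error_route_1 = sample_std_dev / math.sqrt(n)

# Route 2: divide the sample variance by the sample size, then take the sqrt.
std_error_route_2 = math.sqrt(sample_variance / n)

print("st dev of the sample:", sample_std_dev)
print("st dev of the estimate of the mean (route 1):", std_error_route_1)
print("st dev of the estimate of the mean (route 2):", std_error_route_2)
```

Both routes give the same number, and it is always smaller than the st dev of the sample whenever there is more than one observation.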
Thanks for your input,
Doug
[Not sure where to post general questions, so posting here]
I am a little confused about the Value of Information calculations – hopefully you can clarify?
Firstly, in your marketing campaign example, the EVPI is $2M because that’s the EOL of the default decision (i.e. to approve the campaign).
Consider another scenario in which there are three possible outcomes (rather than just the two) as follows: outcome A would result in a net gain of $40M, B would result in a net gain of $10M, and C would result in a net loss of $20M. The probabilities of these outcomes are: A – 10%, B – 30% and C – 60%.
In this scenario, the “default decision” based on the currently available information would be to reject the campaign. Given that default decision, what would the EVPI be? I assume that it would be $7M (=0.1×40 + 0.3×10) – i.e. the sum of the EOLs for each of the positive outcomes.
Similarly, I assume that the EVPI if the currently planned decision is to approve the campaign is $12M (0.6×20).
Are these results correct?
If so, then when we are looking at the EVPI for a RANGE, should we only total the products for each slice that is within that part of the range that is on the side of the threshold that corresponds to the decision that we are not making – i.e. those that are between the Worst Bound and the threshold? So, in the range example in the book, should the EVPI be calculated by totalling ALL of the products for ALL segments, or just for those segments in range B (as shown in exhibit 7.2)?
Secondly, you go on to talk about more complex situations in which there are multiple uncertain variables (initial cost, long-term cost, benefits realised, etc.). In these cases, how do you determine the EVPI of any one of the variables given that the threshold of that one will depend on the actual ultimate values of the other variables? For example, I can’t say what the “break-even” point for (say) initial cost is if I don’t know what (say) the actual ultimate benefit will be. Nor would I be able to calculate the OL for a given segment. Does this mean that one should just use the most likely values for the other variables when calculating the EVPI for a given variable?
Thanks in advance and apologies if I am missing something obvious in these!
MarkH,
You don’t seem to be confused at all about the theory. Your example is correct and the book says what you just said. When computing the EVPI of ranges, you do not work out the expected value of all outcomes in a range – only the expected value of the losses given a particular course of action.
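The arithmetic in the three-outcome example can be checked with a short sketch. The probabilities and payoffs are the ones from the question; the variable names are just illustrative.

```python
# Outcomes from the three-outcome question: (probability, net result in $M if approved).
outcomes = {"A": (0.10, 40.0), "B": (0.30, 10.0), "C": (0.60, -20.0)}

# Expected value of approving; rejecting is assumed to be worth zero.
expected_value_approve = sum(p * v for p, v in outcomes.values())

# EOL of rejecting: the gains we would miss in the positive outcomes.
eol_reject = sum(p * v for p, v in outcomes.values() if v > 0)

# EOL of approving: the losses we would take in the negative outcomes.
eol_approve = sum(p * -v for p, v in outcomes.values() if v < 0)

default_decision = "approve" if expected_value_approve > 0 else "reject"
evpi = eol_approve if default_decision == "approve" else eol_reject

print(default_decision, eol_reject, eol_approve, evpi)
```

Only the outcomes on the "wrong" side of the chosen decision contribute to each EOL, which is the same idea that carries over to ranges.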
Suppose our problem is about betting on a new technology that will improve productivity by 3% to 11%. The threshold is 4% (below which we lose money on the investment) and we decided to invest. We only need to work out the probability-weighted total of the possible values below the 4% threshold. This is how it is explained in chapter 7. And if you look at the spreadsheet provided online for the Chapter 7 example for the EVPI of ranges, you will see that it does exactly this.
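A rough sketch of that probability-weighted total follows. It is not the book's spreadsheet. It assumes the 3% to 11% range is a 90% confidence interval on a normal distribution, and it uses a hypothetical loss of $1M per percentage point below the threshold (`LOSS_PER_POINT` is an invented parameter).

```python
from statistics import NormalDist

lower, upper, threshold = 3.0, 11.0, 4.0   # percent productivity improvement

# Assumption: the range is a 90% CI of a normal distribution,
# which spans about 3.29 standard deviations.
mean = (lower + upper) / 2
std_dev = (upper - lower) / 3.29
dist = NormalDist(mean, std_dev)

LOSS_PER_POINT = 1.0   # hypothetical: $M lost per percentage point below threshold


def evpi_by_slices(n_slices=1000):
    """Total probability x opportunity loss over thin slices below the threshold."""
    start = mean - 6 * std_dev          # far enough out to capture nearly all of the tail
    width = (threshold - start) / n_slices
    total = 0.0
    for i in range(n_slices):
        low_edge = start + i * width
        high_edge = low_edge + width
        midpoint = (low_edge + high_edge) / 2
        probability = dist.cdf(high_edge) - dist.cdf(low_edge)
        total += probability * (threshold - midpoint) * LOSS_PER_POINT
    return total


print("EVPI ($M, under the stated assumptions):", evpi_by_slices())
```

Slices above the threshold are never visited because their opportunity loss is zero when the decision is to invest.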
Thanks for your input,
Doug Hubbard
Thank you very much for your swift response – reading back, I guess that I hadn’t quite clicked that the OL for the midpoint of each segment over the threshold will by definition be zero, so that was the source of my confusion.
I wonder if you could also comment on the second part of my question regarding multiple uncertain variables?
Also, as I have been thinking through this, there is another scenario that I am not sure about: suppose I had to choose between 3 possible courses of action, A, B or C. Based on current information, option A has a 60% chance of resulting in a gain of $10M and a 40% chance of resulting in a loss of $2M; option B has a 55% chance of a gain of $20M and a 45% chance of a loss of $12M; option C has a 70% chance of a gain of $2M and a 30% chance of a gain of $1M (no possibility of loss with this option, but limited potential gain). How would I calculate the value of information I might gain from a study that reduced the uncertainty regarding option B? Would that be 12Mx0.45 + 10Mx0.6 = 11.4M?
Hi Doug,
I was just wondering if you are going to be able to respond to these questions?
Thanks.
Sorry for the late response but – regarding the earlier question – there are two methods for computing EVPIs in multivariate cases. First, EVPIs for each variable can be computed holding the other variables at their “expected” (i.e. probability weighted average) values. This is simple but it is also a slight oversimplification. I use this method as a first draft of the EVPI calculations.
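A minimal sketch of this first method is below. The three-variable model, its means and st devs, and the function names are all hypothetical; the point is only that fixing the other variables at their expected values gives the one variable a definite break-even point.

```python
from statistics import NormalDist

# Hypothetical model ($M): net benefit = benefit - initial_cost - ongoing_cost.
# Each entry is (mean, st dev); normal distributions are an assumption.
VARIABLES = {
    "benefit": (10.0, 4.0),
    "initial_cost": (6.0, 1.0),
    "ongoing_cost": (5.0, 2.0),
}
SIGNS = {"benefit": 1.0, "initial_cost": -1.0, "ongoing_cost": -1.0}


def net_benefit(values):
    return sum(SIGNS[name] * values[name] for name in VARIABLES)


def evpi_single(name, n_slices=2000):
    """EVPI of one variable with every other variable held at its expected value."""
    expected = {key: mu for key, (mu, _) in VARIABLES.items()}
    default_invest = net_benefit(expected) > 0   # decision under current uncertainty
    mu, sigma = VARIABLES[name]
    dist = NormalDist(mu, sigma)
    start, width = mu - 6 * sigma, 12 * sigma / n_slices
    total = 0.0
    for i in range(n_slices):
        low_edge = start + i * width
        midpoint = low_edge + width / 2
        probability = dist.cdf(low_edge + width) - dist.cdf(low_edge)
        outcome = net_benefit({**expected, name: midpoint})
        # Opportunity loss of the default decision if this slice turned out to be true.
        loss = max(-outcome, 0.0) if default_invest else max(outcome, 0.0)
        total += probability * loss
    return total


for variable in VARIABLES:
    print(variable, evpi_single(variable))
```

In this made-up model the expected net benefit is negative, so the default decision is to reject and each variable's EVPI comes from the slices where investing would have paid off.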
Second, we can also compute EVPIs for combinations of variables by running Monte Carlos where we pretend we made the best decision assuming we knew exactly the “real” (in this case, the randomly generated) values in each of thousands of scenarios. The EOL is computed for the decision after running the MC and it is compared to the EOL of the decision if you didn’t know those variables. The difference is the EVPI of that combination of variables.
For example, if you have an investment with variables A through Z, you choose your best decision under the current state of uncertainty, and then run a MC. Suppose your best decision was to reject the investment. But when you run the MC, you find out that sometimes this turns out to be the wrong decision and you lose the benefit you would have made. Then you run the model where you get to change the decision assuming you know, let’s say, variables A, M and Q exactly. Knowing these three variables, you would sometimes make a different decision. If these were important to the decision, then you would have made the wrong decision a little less often. In other words, the EOL of the investment is lower knowing these three variables exactly than if you didn’t know them. The difference between these EOLs is the “joint” EVPI of A, M and Q. This is more realistic and allows grouping of variables together for a measurement. The joint EVPI is often very different from the sum of the individual EVPIs (the value of measuring each one assuming you would be measuring that variable alone).
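One way to sketch this in code is shown below. It is only an illustration under stated assumptions: the three-variable model, its distributions and all names are hypothetical, and the "known" variables take their randomly generated values while the unknown ones are replaced by their expected values when the decision is made (which is only exact because this toy model is linear with independent variables).

```python
import random

# Hypothetical model ($M): net benefit = benefit - initial_cost - ongoing_cost.
# Each entry is (mean, st dev); independent normals are an assumption.
VARIABLES = {
    "benefit": (10.0, 4.0),
    "initial_cost": (6.0, 1.0),
    "ongoing_cost": (5.0, 2.0),
}
SIGNS = {"benefit": 1.0, "initial_cost": -1.0, "ongoing_cost": -1.0}


def net_benefit(values):
    return sum(SIGNS[name] * values[name] for name in VARIABLES)


def simulate(n_scenarios=20000, seed=0):
    """Generate the 'real' values of every variable for each scenario."""
    rng = random.Random(seed)
    return [
        {name: rng.gauss(mu, sigma) for name, (mu, sigma) in VARIABLES.items()}
        for _ in range(n_scenarios)
    ]


def expected_opportunity_loss(scenarios, known=()):
    """Average loss when the decision sees exact values only for `known` variables."""
    total = 0.0
    for actual in scenarios:
        # Known variables use the scenario's value; the rest stay at their expected value.
        seen = {
            name: actual[name] if name in known else VARIABLES[name][0]
            for name in VARIABLES
        }
        invest = net_benefit(seen) > 0
        outcome = net_benefit(actual)     # what really happens in this scenario
        total += max(-outcome, 0.0) if invest else max(outcome, 0.0)
    return total / len(scenarios)


def joint_evpi(scenarios, known):
    """EOL of the default decision minus EOL when `known` variables are known exactly."""
    return expected_opportunity_loss(scenarios) - expected_opportunity_loss(scenarios, known)


scenarios = simulate()
print("EOL of default decision:", expected_opportunity_loss(scenarios))
print("EVPI of benefit alone:", joint_evpi(scenarios, ("benefit",)))
print("Joint EVPI of benefit and ongoing_cost:",
      joint_evpi(scenarios, ("benefit", "ongoing_cost")))
```

The same scenarios are reused for every run, so the only thing that changes between runs is which variables the decision is allowed to see.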
Regarding your most recent question, the EVPI calculation, again, starts with the “default” decision you would make if you had no more information. But I have to ask, are the options mutually exclusive and are the outcomes completely uncorrelated with (i.e. independent of) each other?
Hi Doug,
Thanks for your response.
For the multivariate cases, I understand the first approach you outlined based on using the “expected” values for other variables, although I had guessed that it was probably a bit of an oversimplification.
Unfortunately, I can’t quite seem to get my head around how you use the MC to calculate the EVPI for a variable (or group of variables) in a multivariate problem. My confusion is, I think, based on what would actually be different in the second MC run. You say that in your example you assume that you know A, M and Q exactly – however, my understanding is that in a MC you “know” the values of ALL variables (A through Z) exactly, so I’m a little confused as to how this would be actually reflected in the second run? Maybe it might be helpful (for others as well as for me) if you could possibly include an additional sheet in your “Chapter_6_Monte_Carlo_2ed1.xls” spreadsheet to show how you would use a further MC to calculate the EVPI of (say) Maintenance Savings?
On the other question – the one that I asked about the 3 possible choices, A, B or C – this is a hypothetical scenario in which I am assuming that the three choices are mutually exclusive and that the factors that would influence the success / failure of each are all independent. I also assumed that, given the initial information, B would be the default choice. Part of the EVPI calculation for B is, therefore, the possible loss of B x probability of B failing (i.e. 12Mx0.45 = 5.4M). What I’m not sure of is what should be added to this to calculate the full EVPI for B – should one just consider the potential gain of the “next best” decision (in this case option A), or should one consider all the possible gains and losses of all the other alternatives (in this case A & C). i.e. Should I add 10Mx0.6 (giving a total of 5.4M + 6.0M = 11.4M as the EVPI for B), or should I add [10Mx0.6 – 2Mx0.4 + 2Mx0.7 + 1Mx0.3] = 6.9M which would give an overall total of 12.3M? Hopefully this makes the question a little clearer.
Thanks again for your help so far – I hope I’m not taking an unfair amount of your time on these questions, but, if so, please feel free to say so!!
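For anyone following the numbers in the three-option question, here is a small sketch that only checks the arithmetic. It uses the probabilities and payoffs as stated in the question, confirms which option has the highest expected value, and reproduces the two candidate totals. It does not settle which (if either) candidate is the right EVPI.

```python
# Options as stated in the question: lists of (probability, result in $M).
options = {
    "A": [(0.60, 10.0), (0.40, -2.0)],
    "B": [(0.55, 20.0), (0.45, -12.0)],
    "C": [(0.70, 2.0), (0.30, 1.0)],
}

# Expected value of each option under current information.
expected_values = {
    name: sum(p * v for p, v in results) for name, results in options.items()
}
default_choice = max(expected_values, key=expected_values.get)

# Possible loss of B times the probability of B failing.
loss_if_b_fails = 0.45 * 12.0

# Candidate 1: add only the potential gain of option A.
candidate_1 = loss_if_b_fails + 0.60 * 10.0

# Candidate 2: add the probability-weighted gains and losses of both A and C.
candidate_2 = loss_if_b_fails + expected_values["A"] + expected_values["C"]

print(expected_values, default_choice, candidate_1, candidate_2)
```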