The Psychotic State
April 9, 2001
Numbers Make Me Nuts

Balanced as I am on the knife edge of being a raving lunatic, I find it important to keep a catalog of things that make me crazy. Keeping this list is a critical part of my everyday life, because in its absence I find myself stumbling into situations in which I lose all social control. I have a bad enough reputation as it is.

When I say that numbers make me nuts, I don't mean to blame the numbers themselves. Why, for a geek like me, numbers are some of my best friends. No, it's not the numbers that bug me, it's the way people react to them.

For most people, a number is a weighty thing. It's freighted with authority. For example, take the sentence "There are 1,537,195 people living in Manhattan". Most folks, hearing this, think that they have understood some sort of truth - for example, they think they know how many people live in Manhattan. I'm obnoxious, and that's not how I respond. I immediately want to ask: "What do you mean by a 'person living in Manhattan'? How did you count? What could be wrong with that number?"

Actually, this is not an abstract question: a few years ago the Supreme Court ruled that the only way the US Census could determine the number of people living in Manhattan was to actually _find_ all of them, which strikes me as a particularly bad way to go about it. (For example, you might try to look at how many buildings there are, or how much water is used here, or how many pounds of potatoes are bought - the best way is doubtless to look at a combination of methods.) But the Supreme Court is populated by Republicans, and it turns out to be easier to find the people who live in Republican districts, so that's the way they want it done.

As bad as that is, it gets worse in business. Politics and law are mostly about power, and the numbers are always secondary. Business is about money - wealth measured in numbers - and so the danger to my sanity is all the greater.
I'm always nervous around a spreadsheet. Double-click it and who knows what craziness I'll find going on inside. If the sheet is just a column of names and phone numbers I'm probably safe, but I'm always sure to take a deep breath and relax before opening one containing a Return On Investment calculation.

Even someone as crazy as me knows that you have to have some basis to make decisions. I'm no expert, but it seems to me that if you're in business, you probably want to do those things that will cause you to make money, and eschew those that cost more than they bring in. I know that this is controversial (at least it was last year), but that's the way it seems to me.

The devil, as usual, is in the details. Open a typical ROI spreadsheet, and you'll find a lot of very authoritative-looking calculations, with various "what if" parameters that can be adjusted. Depending on how you fill in the parameters, the spreadsheet will predict how well you'll do in various scenarios. What is interesting is that I've never seen a spreadsheet that was honest enough to contain an error bar.

Remember error bars from high school? In science, they're the key to truth in advertising - in which every number advertises its limitations. If the weight of a chemical in a reaction is 30 +- 2 grams, that means it has a roughly 2/3 probability of being between 28 and 32 grams (taking the error as one standard deviation of a bell curve). You aren't supposed to take it more seriously than that. The Supreme Court doesn't understand this sort of thing, and neither does the average ROI spreadsheet. No, instead one deals with numbers that lie - they don't tell you just how much they should be believed.

An example will clarify. Suppose we evaluate a project that has a single cost C = $10M +- $6M. The benefits are estimated to be B = $16M +- $10M. (Given that the estimates are generally absolutely wild guesses, error bars like these are in no way unreasonable for the sort of calculations typically made. In fact I think that they may be optimistic.)
If we assume that the two errors are uncorrelated bell-shaped distributions (probably a very bad assumption) then the total net benefit B - C is $6M +- $12M. (The error on B - C is the square root of the sum of the squares of the errors on B and C: the square root of 10^2 + 6^2 is about 11.7, call it 12.)

OK, you're a manager, and you're given a spreadsheet that tells you you're going to make $6M if you do something. That means you should do it, right? Wait, think again: what _really_ are you supposed to make of cell D45 on workbook 1? Look under the hood at the error bars and you see that in fact the number hasn't told you much - the decision about whether or not to undertake this project is not determined by the analysis. The spreadsheet should say so, but it doesn't. Usually, no effort was made to estimate the error, so what's hidden is actually the most important thing: that the actual number is probably somewhere between -$6M and $18M, and even that range only holds about two times in three. Maybe you ought not to do it after all - especially if you aren't sure you have millions of dollars you can afford to lose.

In sum, the estimates of mean return are going to be deceiving quite often - and the more variables that are put into the model, the bigger the errors are going to get. This means that a "more careful" analysis might not even help. (And furthermore, the more "careful" one gets, the more need there is to focus on putting "values" on intangibles. If I am trying to decide whether to buy car A or car B, I suppose I might do it by looking at the total cost of each car, including the present value of the payments I have to make to run the two models, attaching "prices" to the benefits of the features of the two cars, and optimizing. I think I am safe in saying that virtually no one buys cars this way, but evidently people think that this is a good way to buy software.)

OK, so what should you do? Well, for starters, if you're building a model like this, you probably should at least make some attempt to determine the limits of what you're claiming when you use a number in this way.
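
For the curious, here is a minimal Python sketch of the arithmetic in the example above. It assumes, as the example does, that the two errors are uncorrelated and bell-shaped (which, again, is probably a very bad assumption). The function names combined_error and chance_of_loss are mine, made up for illustration, and the chance-of-loss figure is only as good as the bell-curve assumption behind it.

```python
import math


def combined_error(*errors):
    """One-sigma error on a sum or difference of uncorrelated quantities:
    the square root of the sum of the squares of the individual errors."""
    return math.sqrt(sum(e * e for e in errors))


def chance_of_loss(mean, error):
    """Probability that a bell-shaped (normal) quantity with the given
    mean and one-sigma error comes out below zero."""
    z = (0.0 - mean) / error
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Figures from the example, in millions of dollars.
benefit, benefit_err = 16.0, 10.0
cost, cost_err = 10.0, 6.0

net = benefit - cost
net_err = combined_error(benefit_err, cost_err)
loss = chance_of_loss(net, net_err)

print(f"net benefit = {net:.0f}M +- {net_err:.1f}M")
print(f"chance of losing money = {loss:.0%}")
```

Under those assumptions the project loses money roughly three times in ten, which is the sort of thing cell D45 never mentions.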
Try attaching an error bar or two to the calculation, and see what comes out.

Even better: you might simply take an even bigger-picture approach. Focus on the areas of major _risk_ and opportunity cost, whether they come from doing the project or from not doing it. "Here are the big unknowns, here's what could go wrong either way, let's focus on those." If you work through the analysis that the spreadsheet embodies, rather than just reading off its answer, you might actually learn something beyond the numbers. You might even discover the real reason to do the project. And you might realize that numbers alone really can make you nuts.