Non-Bayesian Decision Theory

Mathematician Itzhak Gilboa on known and unknown probabilities, various descriptive models, and ambiguity

videos | July 3, 2015

What influences our decisions when we assess the probability of some event? What is there beyond simple mathematical laws? Can there be a fully complete mathematical model? Mathematician Itzhak Gilboa describes thought experiments that show the insufficiency of Bayesian decision theory.

So we have the Bayesian approach, which says that any uncertainty can, and should, be quantified: whatever it is that you do not know, you can assign probabilities to it. This received very important support from the work of Leonard Savage in the 1950s. Savage probably had in mind certain limited state spaces, the spaces over which one has to assign probabilities, but the evolution of economic theory was such that the approach started to be applied to anything and everything.

John Harsanyi, who won a Nobel Prize in 1994 for his contribution to game theory, helped economics deal with situations where we don't know what the utilities are. It is a little surprising, even to me, that I have been around for 30 years and did not understand what the contribution was, but apparently this was a breakthrough that waited for him. Basically he said: "Well, if you don't know what the utilities are, we should think about it as a problem of decision under uncertainty."

Imagine you go back to before you were born, when you don't know whether you are going to be born into this type or that type; then it is just a problem under uncertainty. We go back to when you are still in the womb, before you were born, assign probabilities there, and apply the Bayesian approach. What we now call Nash equilibrium then becomes Bayesian Nash equilibrium: you play a Nash equilibrium, namely everyone acts optimally given what the others are doing, but before you even know who you are going to be born as. This was pushed even further by Robert Aumann, who put even more things into the state space: it describes everything that could possibly have happened in the past, together with knowledge partitions, who knows what, what everyone knows that everyone knows, and so on.

And at some point the Bayesian approach is being applied to a state space that is so large and so informative that you ask yourself: how on Earth would I be able to assign probabilities to that? Now, David Schmeidler, who was my advisor and then a colleague for many years and a good friend, started criticizing this approach in the early eighties. He actually started with an example, talking to his friends in the statistics department: "Suppose I am offering you to bet on a coin. The coin can be one that I pull out of my pocket, which you have never seen, or it can be pulled out of your pocket, one you have played around with and tested. You know that your coin is fair, about fifty-fifty, and about my coin you know nothing. Now, knowing nothing, if you have to assign probabilities to it, because you are committed to being Bayesian, you will probably say 50/50 out of symmetry, something like Laplace's principle of insufficient reason. But doesn't it feel different? A 50/50 that is based on statistics versus a 50/50 you arrived at by shrugging your shoulders and saying, I don't know?"

These are basically the experiments that Daniel Ellsberg suggested, not actually conducted, in a paper he published in 1961; they were thought experiments. He was talking about, let's say, two urns: one urn in which you know there are 50 red balls and 50 black balls; another with 100 balls, each of them black or red, but where you don't know the composition. Same kind of story: if you pull a ball out of the unknown urn and have to assign probabilities, you want to satisfy symmetry, because you have no better reason, so you have to say 50/50. But many people, when asked, say that they would rather bet on either black or red from the known urn, with its known 50% probability, than on either black or red from the unknown urn. And there is no probability that can justify that, because if you assign probabilities to the unknown urn, one of the two would have to be at least 50 percent.
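The last point can be checked mechanically. A minimal sketch (the function name and the probability grid here are illustrative, not from the talk): under any additive prior on {red, black}, the better of the two unknown-urn bets always wins with probability at least one half, so strictly preferring both known-urn bets cannot be rationalized by a single prior.

```python
# Ellsberg two-urn check: with an ADDITIVE probability over
# {red, black}, p(red) + p(black) = 1, so at least one of the two
# unknown-urn bets has winning probability >= 0.5.

def best_unknown_bet(p_red):
    """Winning probability of the better unknown-urn bet, given p(red)."""
    p_black = 1.0 - p_red          # additivity forces this
    return max(p_red, p_black)

# Scan candidate priors: the better unknown bet never drops below 0.5,
# while each known-urn bet is won with probability exactly 0.5.
assert all(best_unknown_bet(p / 100) >= 0.5 for p in range(101))
```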

So, what David Schmeidler did was to suggest a more general theory, built on a notion of probability that is not necessarily additive. Intuitively speaking, the probability of red and the probability of black might both be less than half, while the probability of the union is one. How could that happen? Well, not if you think about empirical frequencies.
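A toy version of such a non-additive probability (a capacity), in the spirit of what is described above; the value 0.4 is my own illustrative assumption:

```python
# A non-additive "probability" over the two-color state space:
# each single color gets less than one half, yet the sure event gets 1.
capacity = {
    frozenset(): 0.0,
    frozenset({"red"}): 0.4,             # below one half
    frozenset({"black"}): 0.4,           # below one half
    frozenset({"red", "black"}): 1.0,    # the union still has weight 1
}

# Additivity fails: v({red}) + v({black}) < v({red, black}).
assert (capacity[frozenset({"red"})] + capacity[frozenset({"black"})]
        < capacity[frozenset({"red", "black"})])
```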

If you think about your subjective probabilities, this seems to describe your willingness to bet, if you prefer the known probabilities to the unknown ones.

So, there was the question of how you make decisions with respect to that. The standard approach is expected utility theory, which means you take an integral of the utility function with respect to a probability measure, and we know how to do that. How do you do it in the context of non-additive ones? What David Schmeidler did was to use the notion of Choquet integration, which Gustave Choquet suggested in 1953-54 in the context of physics, electrical charges and so on, and to provide axiomatic foundations for it. I mean, his followers, including myself, provided axiomatic foundations, similar to the work of Savage or Anscombe and Aumann in the classical theory, for the case where probabilities might not add up: the probability of a union might not be the sum of the probabilities.
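On a finite state space, the Choquet integral of a nonnegative payoff can be computed by sorting states from best to worst and weighting the payoff increments by the capacity of the growing upper-level sets. A sketch under that assumption (the capacity values and all names are illustrative):

```python
def choquet(values, capacity):
    """Choquet integral of a nonnegative function w.r.t. a capacity.

    `values` maps states to payoffs; `capacity` maps frozensets of
    states to weights, monotone in set inclusion, with v(empty) = 0.
    """
    states = sorted(values, key=values.get, reverse=True)  # best first
    total, upper = 0.0, []
    for i, s in enumerate(states):
        upper.append(s)                    # upper-level set so far
        nxt = values[states[i + 1]] if i + 1 < len(states) else 0.0
        total += (values[s] - nxt) * capacity[frozenset(upper)]
    return total

# With the ambiguous urn's capacity, a bet paying 1 on red is worth
# only 0.4, below the 0.5 of the known urn -- the Ellsberg pattern.
v = {frozenset(): 0.0,
     frozenset({"red"}): 0.4, frozenset({"black"}): 0.4,
     frozenset({"red", "black"}): 1.0}
bet_on_red = {"red": 1.0, "black": 0.0}
print(choquet(bet_on_red, v))  # 0.4
```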

This was related to another work that David and I did together, which has to do with more than one probability, or let's say multiple priors. It is called a prior because all of this is before you get information; after you get information it would be a posterior, but we are going back to the prior. There we looked at a model in which you say: okay, maybe instead of having one probability we will have a set of probabilities, because that is the standard situation in classical statistics. Classical statistics is about having a set of distributions without trying to quantify which one it is. That is the opposite of Bayesian statistics, which has probabilities over probabilities; in classical statistics you have a set of distributions and that's it. All the confidence intervals and hypothesis testing we do are in this tradition: a set of distributions, without trying to quantify over them.

And in our model there are again axiomatic foundations: namely, what kind of behavior, or what kind of consistency of behavior, would be equivalent to the representation. In our case, you are not given a utility function, but if you satisfy these axioms then it is as if there exists a utility function and, this time, not a single probability measure but a set of probabilities. Now you make decisions: for every probability in the set you compute the expected utility of a given option. What comes out in our model is that you look at the minimum, so it is called Maxmin Expected Utility: for every option I compute all the possible expected utility values as I range over the probabilities, then take the minimum one, and that is what I try to maximize. Many other models have been developed since; I don't want to try to describe them all here, but there is definitely more than one model.
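The evaluation rule just described can be sketched in a few lines; the particular set of priors and payoffs below are illustrative assumptions, not numbers from the talk:

```python
# Maxmin Expected Utility: an option is valued by the WORST expected
# utility over a set of priors, and that worst case is maximized.

def expected_utility(payoffs, prior):
    return sum(prior[s] * u for s, u in payoffs.items())

def maxmin_value(payoffs, priors):
    """Minimum expected utility of one option over a set of priors."""
    return min(expected_utility(payoffs, p) for p in priors)

# Unknown urn: suppose any red-share from 30% to 70% is considered possible.
priors = [{"red": p / 100, "black": 1 - p / 100} for p in range(30, 71, 10)]

bet_on_red = {"red": 1.0, "black": 0.0}   # pays 1 if red, else 0

# Worst case over the prior set is 0.3 < 0.5, so the known 50/50 urn
# is preferred to the ambiguous one -- again the Ellsberg pattern.
print(maxmin_value(bet_on_red, priors))  # 0.3
```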

These things have been applied to a variety of phenomena that are relatively harder to explain within the Bayesian paradigm, including, you know, macro phenomena and finance phenomena. Just one example that is very close to the original Ellsberg experiment is what is called in finance the home bias: when you look at how people trade, they seem to prefer trading in domestic equities, the equities of their own country, as compared to foreign ones. You might stop them and ask why, because according to finance theory the price already incorporates all the information, so the stock has the same probability of going up or down; otherwise people would buy or sell. So if you think it is going to go up or down with the same probability, 50/50, then what is the difference between the 50/50 of your home country and the 50/50 of a foreign country?

But then a couple of people suggested that maybe this is exactly the Ellsberg phenomenon: the foreign equity is like the unknown urn. If you really insist, I'd say 50/50 up or down just out of symmetry, because my ignorance is symmetric, but it is not something we really know. Whereas when it comes to my home country, I know more about it, I know the firm, I read newspapers about it, and so on; maybe I feel that I know more about where the 50/50 comes from. But this is just one example of the many phenomena, in macroeconomics, in labor economics, and elsewhere, that can be better explained using these kinds of models.

Which model should you choose when there are many models that deal with this? By the way, the phenomenon is also called ambiguity, the absence of probabilities, or Knightian uncertainty, because Frank Knight talked about it in the 1920s. Ellsberg called it ambiguity, and the common assumption is that people don't like ambiguity, so it is called Knightian uncertainty aversion or ambiguity aversion. Experiments sometimes find people who do like it, but most often the researchers tend to believe that people prefer to know the probabilities rather than not.

But there are many such models, and people sometimes ask which one you should use. Typically I would say we don't know; none of our models is correct, and we know this.

I walk into a class in economics and tell the students: let's agree, first of all, that all the theories are wrong and all the models are wrong, and now let's get started.

So the question is not whether the theory is correct, since we know it is false; the question is whether it is false in a way that invalidates the conclusion. Is it false in a way that is really important? When you start dealing with uncertainty, I think the Bayesian model is, and should remain, the benchmark: it is the way you start analyzing a problem.

But if you see that things cancel out, say the expectation of X and the expectation of -X cancelling each other, that is the time to say: "Wait a minute, I'm becoming a little suspicious." If the qualitative result you get depends on this cancellation, then maybe you want to see what happens in a model where it doesn't cancel out; maybe you are missing something that is qualitatively important.

And then it might not be that important which model to use, because what you are basically doing is testing how robust the conclusion is. If I have this cup of tea here and I want to see whether it is stable, I can push it a little from the right and a little from the left. It is probably not so important whether I push it from the right or from the left, and in the same way we play with models. There is some qualitative result that we hope tells us something about the world; before we go on and really trust it, we would like to test how robust it is. Then we can use this or that model of ambiguity or uncertainty to see whether the qualitative conclusion we arrived at in the Bayesian model is really stable, solid, robust, or not.

The Chair for Decision Theory and Economic Theory, Eitan Berglas School of Economics, Tel Aviv University, AXA Chair for Decision Sciences, Department of Economics and Decision Sciences, HEC, Paris