Neuroscientist Karl Friston on the Markov blanket, Bayesian model evidence, and different global brain theories
The free energy principle originally emerged from systems neuroscience as a principled way of understanding what the brain does and how it does it. Subsequently, the principle proved to be so simple and so powerful that it has been applied in a variety of contexts. So one could almost regard the free energy principle as an organizing principle for any living system that shows the characteristics of life.
So, the reason I start like that is that there are two roads to explaining or understanding the free energy principle. You can start from the perspective of people like Helmholtz in the 19th century, trying to understand unconscious inference in the brain, and build a story through analysis by synthesis in psychology through to current and exciting developments in machine learning, things like Geoffrey Hinton’s Helmholtz machine, and then how that has become contextualized in the enactivist or embodied cognition context. Generalizing these notions, you end up with the free energy principle. Or you can start from the top and just ask very simple questions: what is it to be alive? And, if you are alive and you exist, what sorts of behaviors must you show? In fact, if you answer those questions you end up with exactly the same answers that you would have got had you followed the historical route.
For brevity I’ll take the high road. I’ll go from the minimalist assumption that things exist, then try to unpack that and show how one can get to notions of the brain as an inference engine, sometimes called the Bayesian brain hypothesis: the brain as one of the best examples of an organ that is actively constructing explanations through its own sampling of the world. This enactive perspective is very important because not only does the brain have to explain all the sensory input, it also has to choose which sensory input to sample. It is in charge of gathering information, evidence for its own predictions and its own beliefs about the world. But I’ve jumped ahead, so now I have to explain to you why it is that any system that exists will behave as if it has a model of the world and is trying to gather evidence for its own model of the world.
So, the story starts just by acknowledging that if you want to talk about something, there has to be a separation between the thing you are talking about and everything else. If there were no boundaries there would be nothing, because there would be no distinction between the thing and not that thing. Statistically speaking, that distinction or that boundary is called a Markov blanket. It’s just a mathematical way of separating the states of some abstract system (an organism, a culture, a cell, a brain) into things that are internal to the boundary, that are owned by that system, and things that are outside the boundary, that are external to the system. So, it could be a cell and its milieu, it could be a phenotype, it could be me and my environment. At any scale there has to be this division. Now, the very existence of that separation, that Markov blanket, in conjunction with the assumption that the system exists over time, tells you something quite profound about the behavior of the internal states and the states that constitute the Markov blanket.
This is a bit abstract but it is actually quite simple. The Markov blanket has two bits to it. There are the sensory states, which are defined by the fact that they don’t influence the external states but they do influence the internal states. So sensory information, for example, would be mediated by sensory states as it gets from the outside world into my internal world, my brain. And there are active states that go in the other direction. They influence external states but are not influenced by the external states; they are actually dependent upon the internal states. If I take me as a model of my world, my active states would be how I am currently moving, whereas my sensory states would be the activities of my photoreceptors, all those sensory organs and sensory epithelia I have at my disposal.
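As a minimal sketch of the partition just described, the four kinds of state can be written as a small influence graph and the separation checked directly. The state names, the `INFLUENCES` dictionary and the `is_markov_blanket` helper are illustrative assumptions, not part of any published formulation.

```python
# Hypothetical influence graph: each key directly influences the states in its set.
# The edges follow the description in the text and are an illustrative assumption.
INFLUENCES = {
    "external": {"sensory"},   # the world drives the sensory states
    "sensory": {"internal"},   # sensory states drive the internal states
    "internal": {"active"},    # internal states drive the active states
    "active": {"external"},    # active states act back on the world
}


def influences(graph, source, target):
    """Return True if `source` directly influences `target`."""
    return target in graph.get(source, set())


def is_markov_blanket(graph):
    """Check the separation described in the text.

    Internal and external states never influence each other directly,
    sensory states do not influence external states, and active states
    are not influenced by external states.
    """
    return (
        not influences(graph, "internal", "external")
        and not influences(graph, "external", "internal")
        and not influences(graph, "sensory", "external")
        and not influences(graph, "external", "active")
    )


# A "leaky" variant in which the world reaches the internal states directly.
leaky = {**INFLUENCES, "external": {"sensory", "internal"}}

print(is_markov_blanket(INFLUENCES))
print(is_markov_blanket(leaky))
```

The second graph fails the check because the external states bypass the blanket, which is exactly the situation the boundary is meant to exclude.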
I don’t have time to go into it, but it is a beautiful observation that the defining dynamics of any system that does not dissipate over time are such that, on average, its states will flow so as to maximize Bayesian model evidence. So, that means that if a system exists then it will appear to maximize Bayesian model evidence; it will appear to be a little Bayesian engine. It will appear as if it has a model of its world. Why? Well, let’s now go back to the Markov blanket, which comprises the active and sensory states, and the internal states that are encompassed by the Markov blanket. The rule says that all of these states must maximize model evidence, which is also known as marginal likelihood and whose negative logarithm is bounded from above by free energy, hence the free energy principle. All of those states have to maximize marginal likelihood or minimize free energy, including action. That means action, sensations and the internal states are all doing the same thing. Which means that we can understand the internal states, say of the brain, as modeling the world, because they are maximizing the Bayesian model evidence for me as a model of the world. At the same time, my action is also trying to maximize the evidence for my model of the world. So, put very simply, almost by definition I am in the game of garnering information that maximizes the evidence for my own existence, and that’s basically the free energy principle. It is a corollary or a consequence of being a system that doesn’t dissipate: any such system looks as if it is actively soliciting information from the environment, and modelling that information as a model of the environment, to maximize the evidence for its own existence.
And that’s where we started, with the long history from Helmholtz’s notion of unconscious inference right through to modern-day machine learning formulations, for example the Helmholtz machine of Geoffrey Hinton and Peter Dayan.
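The relationship between free energy and model evidence can be checked numerically on a toy example. The sketch below assumes a hypothetical generative model with one binary hidden cause and a single fixed observation; the probabilities are made up for illustration. It evaluates the variational free energy F(q) = Σ q(z) [log q(z) − log p(o, z)] for several beliefs q and compares it with the negative log evidence −log p(o).

```python
import math

# Hypothetical two-state generative model (illustrative numbers only).
prior = [0.7, 0.3]        # p(z) for z = 0, 1
likelihood = [0.2, 0.9]   # p(o | z) for the one observation o that was made

joint = [p * l for p, l in zip(prior, likelihood)]   # p(o, z)
evidence = sum(joint)                                # p(o), the marginal likelihood
posterior = [j / evidence for j in joint]            # p(z | o)
surprise = -math.log(evidence)                       # negative log evidence


def free_energy(q):
    """Variational free energy of a belief q over the hidden cause z."""
    return sum(
        qz * (math.log(qz) - math.log(j))
        for qz, j in zip(q, joint)
        if qz > 0.0
    )


# For any belief q, the gap F(q) - surprise is the KL divergence from q
# to the posterior, so it can never be negative.
for q0 in (0.1, 0.5, 0.9):
    gap = free_energy([q0, 1.0 - q0]) - surprise
    print(f"q(z=0)={q0:.1f}  F - (-log evidence) = {gap:.4f}")

# The bound is tight when the belief equals the true posterior.
print(abs(free_energy(posterior) - surprise) < 1e-9)
```

Because the gap is a Kullback-Leibler divergence, minimizing free energy both tightens the bound and, once it is tight, maximizes the evidence itself.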
This principle can be unpacked at many, many different levels, and it has provided a very useful framework within which to understand how the principle is complied with by the biology, the anatomy and the physiology of the brain. What it tells you is that the anatomy of any system has to contain within it a model of the environment in which that system is immersed. Which means that if we live in a world that has some deep hierarchical structure, in which there is action at a distance, for example, so that the color of objects around me is determined by the incident light as it comes almost instantaneously to my eye, or a falling body is caused by gravity, then my brain must recapitulate that causal structure, and of course it does.
The very fact that we have nerve cells with long, slender connections linking each other at a distance speaks exactly to the causal architecture of the world that we inhabit, which has this action at a distance and this sparse connectivity. Furthermore, the hierarchical structure of the world is recapitulated in the neuronal structures that constitute the hierarchies of the connectome, or the hierarchical disposition of functionally specialized brain areas.
You can go further. If the brain is truly a statistical model of the world it inhabits, can we understand some fundamentals of brain organization, such as the distinction between the what and where streams in the brain? A very powerful observation, a principle of functional specialization, is that "what" processing is carried out by a stream of brain areas roughly down here, and a more dorsal stream is concerned with "where". That may be a simple reflection of the fact that we live in a universe where different things can be in different positions, so that we can statistically separate the whatness from the whereness. If we lived in a universe where whenever something moved it also changed its nature, we couldn’t do that. So, just by looking at the brain I can tell you the sort of universe that you inhabit under the free energy principle, under the assumption that your brain has become a model of the environment that it inhabits.
The free energy principle has been quite useful from my perspective and that of my colleagues largely because it shows the connections between previous theories. There are many global brain theories that have been brought to bear. For example, the principle of minimum redundancy, maximum efficiency, notions of the brain extracting as much information as it can from the environment.
So, all I’ve said so far is that in principle every internal state, every action that I make, every sensation that I gather should be at the service of minimizing variational free energy or maximizing marginal likelihood. How? How do you do that? How does a brain do that? Well, if you know what the objective function is, if you know what the imperatives are, you can then cast it in terms of processes. For example, I can say: “well, this minimization of variational free energy or maximization of Bayesian model evidence is a hill climbing or gradient descent algorithm.” So, I can now write down a differential equation where every neuronal state, every physiological variable in the brain now becomes describable as a differential equation given other states in the brain. And, if that equation is true, then I can now go and map the variables to physiological processes.
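To make the idea of a gradient descent on free energy concrete, here is a minimal sketch under strong simplifying assumptions that are not stated in the talk: a single internal state `mu`, a Gaussian prior over the hidden cause, a Gaussian likelihood with an identity mapping, and a point-estimate form of the free energy, F(mu) = (o − mu)² / (2·obs_var) + (mu − prior_mean)² / (2·prior_var) up to a constant. The differential equation d(mu)/dt = −dF/d(mu) is integrated with plain Euler steps; all function and parameter names are hypothetical.

```python
def free_energy_gradient(mu, o, prior_mean, prior_var, obs_var):
    """dF/dmu for the quadratic free energy assumed in the lead-in."""
    return -(o - mu) / obs_var + (mu - prior_mean) / prior_var


def descend(o, prior_mean=0.0, prior_var=1.0, obs_var=1.0, dt=0.1, steps=200):
    """Euler integration of d(mu)/dt = -dF/dmu, starting at the prior mean."""
    mu = prior_mean
    for _ in range(steps):
        mu -= dt * free_energy_gradient(mu, o, prior_mean, prior_var, obs_var)
    return mu


def exact_posterior_mean(o, prior_mean=0.0, prior_var=1.0, obs_var=1.0):
    """Closed-form Bayesian answer for this Gaussian model, for comparison."""
    precision = 1.0 / prior_var + 1.0 / obs_var
    return (prior_mean / prior_var + o / obs_var) / precision


observation = 2.0
print(descend(observation))               # settles at the free energy minimum
print(exact_posterior_mean(observation))  # precision-weighted average
```

In this toy case the flow settles on the precision-weighted average of the prior expectation and the observation, which is the exact posterior mean; in a process theory, the two terms of the gradient are the kind of quantities one would then try to map onto physiological variables.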
And if one plays that game, you can go an enormous way in starting to understand not just the anatomy but also the physiology. You can also generate questions, because there are alternative processes that conform to the same principle: does the brain use sampling techniques to maximize model evidence, or does it use hill climbing optimization schemes, variational schemes? So, you start to generate a whole raft of testable hypotheses pertaining to the process theory that are all consistent with the overarching principle.
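For contrast with hill climbing, a sampling scheme can be sketched on a simple Gaussian model (Gaussian prior over a hidden cause, Gaussian likelihood, both assumed here purely for illustration). Instead of following a gradient, the code draws hidden causes from the prior and averages the likelihood of the observation, which gives a Monte Carlo estimate of the model evidence. The function names, sample size and seed are hypothetical choices.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def sampled_log_evidence(o, prior_mean=0.0, prior_var=1.0, obs_var=1.0,
                         n=20000, seed=0):
    """Estimate log p(o) by averaging p(o | mu) over samples mu drawn from the prior."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        mu = rng.gauss(prior_mean, math.sqrt(prior_var))
        total += gaussian_pdf(o, mu, obs_var)
    return math.log(total / n)


def exact_log_evidence(o, prior_mean=0.0, prior_var=1.0, obs_var=1.0):
    """Closed form for this model: o is Gaussian with the two variances added."""
    return math.log(gaussian_pdf(o, prior_mean, prior_var + obs_var))


observation = 2.0
print(sampled_log_evidence(observation))
print(exact_log_evidence(observation))
```

Both routes target the same quantity, so the principle alone does not say which one a brain uses; that is the sort of question a process theory has to settle empirically.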