The origins of neural networks go back to the end of the 19th century. At that time, the first measurements were carried out on the electrical signalling activity of neurons, either in living animals or in the lab with tissue that was still fresh and still able to carry out its normal physiological activity.
These first measurements showed that neurons do not behave deterministically: the same neuron with the same input will not react twice, three times, or four times in exactly the same manner. Furthermore, the sequence of signals that a neuron generates to speak to other neurons or to sensory or motor organs does not occur in a deterministic fashion. The times between signals can have a certain average and a certain variance, but they do not repeat in an absolutely deterministic way. This point has been made many times.
Furthermore, it was known from the beginning that, physiologically, the main activity of neurons is spiking in nature, while most of the neural network models used today, including those used in deep learning, are deterministic and use analogue activity.
So how can one go back, using rigorous techniques and mathematics, to the initial reality of the way neurons operate?
This is where the random neural network comes into the picture. The random neural network has two properties. The first is that spikes represent the signalling activity in the network, unlike in deep learning models. The second is that these spikes occur in a stochastic, random manner. Random doesn’t mean that anything can happen: random means that there is a certain probability distribution, but that you cannot predict exactly when each event will occur.
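To make "random but with a certain probability distribution" concrete, here is a minimal sketch. It assumes, purely for illustration, that the gaps between spikes are exponentially distributed (a Poisson spike train); the function name, the 50 spikes per second rate, and the 20 second duration are hypothetical and not taken from the talk.

```python
import random

def poisson_spike_times(rate_hz, duration_s, rng):
    """Return spike times in [0, duration_s) with exponential gaps between spikes."""
    times = []
    t = rng.expovariate(rate_hz)  # time of the first spike
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_hz)  # add the next random gap
    return times

rng = random.Random(0)
# Same (hypothetical) neuron, same rate, two separate recordings.
train_a = poisson_spike_times(rate_hz=50.0, duration_s=20.0, rng=rng)
train_b = poisson_spike_times(rate_hz=50.0, duration_s=20.0, rng=rng)

# The individual spike times differ from one recording to the next...
print(train_a[:3])
print(train_b[:3])
# ...while the average firing rate stays close to the underlying rate.
print(len(train_a) / 20.0, len(train_b) / 20.0)
```

The two recordings never coincide spike for spike, yet their average rates agree closely, which is the sense in which the behaviour is random without being arbitrary.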
The random neural network incorporates these notions, and it represents a network of arbitrary size: it may have a hundred, a thousand, a million, or any number of nodes. A beautiful mathematical property arises in this random network structure. Whether a system is deterministic or stochastic, it is extremely difficult to analyse when it is very large: you always hit the barrier of computation. With the random neural network, there is a beautiful result that I showed back in the late 80s, which says that despite this high complexity, this randomness, and this complete interaction between cells, if you wait long enough, each cell has a very simple formula which represents how likely it is to be active and sending spikes. This formula is based on the interaction of all the other cells with it, either through excitation (spikes which increase its ability to be active) or through inhibition (spikes which reduce its ability to be active).
So, the random neural network has no bounds on its size and no bounds on interaction. It uses a stochastic model, and it has spikes everywhere, but it has this very nice result: the probability that a cell is excited is the ratio of the arrival rate of excitatory spikes to the sum of the cell’s own firing rate and the arrival rate of inhibitory spikes. Under the assumptions that are made, this is a theorem, not an approximation and not a heuristic: it is the exact result.
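A minimal numerical sketch of this result follows. It relies on the usual statement of the model, in which the denominator also contains the cell's own firing rate, so that q_i = (excitatory arrival rate) / (firing rate + inhibitory arrival rate), and in which the arrival rates themselves depend on how excited the other cells are. The names `Lambda`, `lam`, `r`, `p_plus`, `p_minus` and the two-cell numbers are hypothetical, chosen only for illustration.

```python
def solve_excitation_probabilities(Lambda, lam, r, p_plus, p_minus, iters=200):
    """Iterate the excitation-probability equations until they settle.

    Lambda[i]     : external excitatory spike arrival rate at cell i
    lam[i]        : external inhibitory spike arrival rate at cell i
    r[i]          : firing rate of cell i when it is excited
    p_plus[j][i]  : probability that a spike from j arrives at i as excitation
    p_minus[j][i] : probability that a spike from j arrives at i as inhibition
    """
    n = len(r)
    q = [0.0] * n  # start with every cell unexcited
    for _ in range(iters):
        new_q = []
        for i in range(n):
            excitatory = Lambda[i] + sum(q[j] * r[j] * p_plus[j][i] for j in range(n))
            inhibitory = lam[i] + sum(q[j] * r[j] * p_minus[j][i] for j in range(n))
            # Cap at 1: a ratio of 1 or more would mean the cell is saturated.
            new_q.append(min(1.0, excitatory / (r[i] + inhibitory)))
        q = new_q
    return q

# Hypothetical two-cell network (each row of p_plus + p_minus sums to at most 1).
Lambda = [0.5, 0.2]
lam = [0.1, 0.1]
r = [1.0, 1.0]
p_plus = [[0.0, 0.5], [0.3, 0.0]]
p_minus = [[0.0, 0.2], [0.4, 0.0]]

q = solve_excitation_probabilities(Lambda, lam, r, p_plus, p_minus)
print(q)
```

Each cell's value is computed from a simple ratio of spike flows, and repeating that computation for all cells settles on a consistent set of probabilities without ever enumerating the joint states of the network.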
Furthermore, the results say something even more interesting: the joint probability distribution of the state of a very large system is the product of the individual probabilities of the subsystems. This simplifies the calculations a lot: it means that you can calculate one and calculate the other separately, then put them together, and the result is correct. This product form is the second important result.
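Written out, and using symbols that are not in the talk (k_i for the non-negative integer internal state of cell i, and q_i for the probability that cell i is excited), the product form is usually stated along these lines:

```latex
\lim_{t \to \infty} P\bigl[k_1(t) = k_1, \dots, k_n(t) = k_n\bigr]
  = \prod_{i=1}^{n} (1 - q_i)\, q_i^{k_i}, \qquad 0 \le q_i < 1 .
```

Each factor depends on a single cell, which is why the cells can be handled one at a time and the results multiplied together.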
So, these are the mathematical properties of the network. Because it has such nice mathematical properties, back in the 1990s I was able to show that learning algorithms for this network have polynomial time complexity. What does that mean? For any problem, the time it takes to compute a solution can be the time needed to search through all possible solutions. This is called exponential time complexity, and it is the normal situation: you try everything out, and you get the solution. On the other hand, if you have polynomial time complexity, it means that you do things in time proportional to the size of the problem, or to the size to the power of two, or to the size to the power of three. In the particular case of the random neural network, the polynomial time complexity is the size to the power of three. It was the first time it was shown that a neural network model had a learning algorithm of polynomial time complexity even when the network was completely connected and allowed all kinds of communication. Of course, much simpler cases had been shown to have this property, but because of the mathematical properties of this model, we could get polynomial time complexity for a completely arbitrary, completely complex model.
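The gap between the two kinds of complexity can be seen with a few lines of arithmetic. This sketch assumes, for illustration only, that "trying everything out" means checking every combination of n yes/no choices, which gives 2 to the power n steps; the function names are hypothetical.

```python
def cubic_steps(n):
    """Step count that grows as the size to the power of three."""
    return n ** 3

def exhaustive_steps(n):
    """Step count for trying every combination of n yes/no choices."""
    return 2 ** n

# Print the size, the cubic count, and the exhaustive count side by side.
for n in (10, 20, 40, 80):
    print(n, cubic_steps(n), exhaustive_steps(n))
```

Doubling the size multiplies the cubic count by eight, whereas it squares the exhaustive count, so the exhaustive approach quickly becomes impractical.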
With this came further developments. Later, it was shown that this model has the property of universal approximation, so it can approximate continuous and bounded functions. This is very useful. In recent years, we have shown how it can be used for deep learning. Every time, we use these beautiful mathematical properties to simplify the computations, so that it is much faster to compute with a random neural network than with an ordinary deep learning system, simply because each individual cell’s state can be calculated individually from the flows of spikes from other cells. Another nice property is that if we need to use a very large network, then rather than say, ‘This network has a million cells’, we can say, ‘It has n cells, where n tends to infinity’, and so obtain the asymptotic properties of the network rather than treating n equal to a million, a million and one, a million and two, and so on.
So, this network has a lot of very nice properties from a mathematical perspective. It also has a lot of interesting applications.
For instance, one recent application is the detection of attacks in cybersecurity. A random neural network is trained on data for normal traffic, data for attack traffic, and data for attack traffic mixed with normal traffic, and we can then use it to identify which parts of the traffic are attack traffic.
Another very interesting application is in how we organize routing on the Internet and how we move packets around the Internet. On the Internet, we use the random neural network with reinforcement learning, which is a technique where an advisor to the network observes its actions and says, ‘These are good, these are bad, modify your parameters so that you do the better things’. You continue doing this iteratively, and the learning occurs over a long period of time; the system is continuously learning and improving itself. Of course, if the system’s conditions change, then the learning will adapt to these changing conditions. We are using random neural networks to do these kinds of controls for Internet interconnections.
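The feedback loop described here can be sketched in a few lines. This is not the random neural network learning rule itself: as a stand-in, it uses a plain running average of observed delay per route with occasional random exploration. The three routes, their delays, and all parameter values are hypothetical.

```python
import random

def choose_route(estimates, epsilon, rng):
    """Mostly pick the route with the lowest estimated delay; sometimes explore."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    return min(range(len(estimates)), key=lambda i: estimates[i])

def learn_routes(delays_by_phase, steps_per_phase=500, alpha=0.1, epsilon=0.1, seed=1):
    """Return the route judged best at the end of each phase of network conditions."""
    rng = random.Random(seed)
    estimates = [0.0] * len(delays_by_phase[0])  # optimistic start: every route gets tried
    best_per_phase = []
    for delays in delays_by_phase:
        for _ in range(steps_per_phase):
            route = choose_route(estimates, epsilon, rng)
            observed = rng.gauss(delays[route], 2.0)  # noisy measured delay
            # Feedback step: nudge the estimate towards what was just observed.
            estimates[route] += alpha * (observed - estimates[route])
        best_per_phase.append(min(range(len(estimates)), key=lambda i: estimates[i]))
    return best_per_phase

# Conditions change between phases: route 1 becomes congested, route 2 improves.
phases = [[30.0, 10.0, 20.0], [30.0, 40.0, 15.0]]
print(learn_routes(phases))
```

Because the estimates keep being updated from fresh observations, the preferred route shifts when conditions change, which mirrors the continuous adaptation described above.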
Another very interesting application is in recognizing tumours in the brain. Tumours are difficult to detect in the brain because brains are non-homogeneous organs and no two are identical. The variety between brains forces you to base detection on small, very local sets of properties, and the random neural network is very good at doing that.
To summarise, we started by going back to the foundations of what real neurons are, and we built a mathematical model for spiking networks, a mathematical model for real networks, which are random. We then turned this mathematical model, which is inspired by biology, into an engineering tool with numerous applications: in imaging, in the control of internet-type systems, in the detection of cyberattacks, and in other new areas that may come up.
One of the big challenges when we use all of these techniques is the realisation that our neurons work on a timescale of one millisecond, a thousandth of a second, while a computer works at a fraction of a millisecond. So the computer is much faster. How are we able to do such complex things with much slower components? That is an open question, and it leads to another: are we using the right models? We have tried to look at this. For instance, we have studied random neural networks which interact not just in the usual way, through excitation and inhibition, but also through what we call soma-to-soma interactions, where the ‘head’ of the cell communicates directly with the ‘head’ of neighbouring cells without passing through the dendrites. So the whole issue of why we are able to do things so fast is quite mysterious. I think these are interesting questions which will be addressed in the coming years.