Computer scientist Erol Gelenbe on the communication of neurons, the mathematical properties of random neural networks and how they can be applied

If we go back to the origins of neural networks, we discover that they date from the end of the 19th century. At that time the first measurements were carried out on the electrical signalling activity of neurons, either in living animals or in the lab on tissue that was still fresh and still able to carry out its normal physiological activity in the lab environment.

These first measurements showed that neurons do not behave deterministically: the same neuron with the same input will not react in exactly the same manner two, three or four times in a row. Furthermore, the sequence of signals that a neuron generates to speak to other neurons or to sensory or motor organs does not occur in a deterministic fashion. The times between signals may vary around a certain average, perhaps with a certain variance, but they do not repeat in an absolutely deterministic way. This aspect has been pointed out many times.

Furthermore, it was known physiologically from the beginning that the main activity of neurons is of a spiking nature, while most of the neural networks used today, including those used in deep learning, are deterministic and use analog activity. So how can one use rigorous techniques and mathematics to go back to the initial reality of the way neurons operate?

This is where the random neural network comes into the picture. The random neural network has two properties. The first is that spikes represent the signalling activity in the network, contrary to the usual deep learning models. The second is that these spikes occur in a stochastic, random manner. Random doesn’t mean that anything can happen: random means that there is a certain probability distribution but that you cannot predict exactly when each event will occur.
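To make "a certain probability distribution, but no exact prediction" concrete, here is a minimal sketch of a stochastic spike train. It assumes spikes form a Poisson process, so the gaps between spikes are exponentially distributed; the function name `spike_times` and the 20 spikes per second rate are illustrative choices, not values from the talk.

```python
import random

def spike_times(rate_hz, duration_s, seed=None):
    """Return spike times (in seconds) of a Poisson process with the given mean rate."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        # Inter-spike intervals are exponentially distributed with mean 1 / rate_hz.
        t += rng.expovariate(rate_hz)
        if t >= duration_s:
            return times
        times.append(t)

# Two runs of the "same neuron with the same input": same statistics, different spikes.
train_a = spike_times(rate_hz=20.0, duration_s=100.0, seed=1)
train_b = spike_times(rate_hz=20.0, duration_s=100.0, seed=2)
mean_rate_a = len(train_a) / 100.0
mean_rate_b = len(train_b) / 100.0
print(mean_rate_a, mean_rate_b)
```

The two trains never coincide spike for spike, yet both average close to the same rate, which is the sense of "random" used here.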

The random neural network incorporates these notions and it represents a network of arbitrary size, so it may have a hundred, a thousand, a million, whatever number of nodes. And in this random network structure a beautiful mathematical property arises. Whether a system is deterministic or stochastic, it is extremely difficult to analyse when it is very large: you always hit the barrier of computation. With the random neural network there is this beautiful result, which I showed back in the late 80s, which says that despite this high complexity, this randomness, this complete interaction between cells, if you wait long enough, each cell will have a very simple formula which tells you how likely it is to be active and sending spikes. This formula is based on the interaction of the other cells with it, either through excitation (messages or spikes which increase a cell’s ability to communicate) or through inhibition (spikes or messages which reduce its ability to be active).

So the random network has no bounds on its size and no bounds on interaction. It uses a stochastic model and it has spikes everywhere, but it has this very nice result that says that the probability that a cell is excited is the arrival rate of the incoming excitatory stream of spikes divided by the sum of the cell’s own firing rate and the arrival rate of the inhibitory stream of spikes. Given the assumptions that are made, this is a theorem: it is not an approximation and it is not a heuristic. It is the exact result.
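A small numerical sketch can make this ratio concrete. In the published formulation of the model the denominator is the cell's own firing rate plus the inhibitory arrival rate, and the arrival rates themselves depend on how excited the other cells are, so the probabilities are found by solving the equations together. The code below does this by simple fixed-point iteration on a two-cell network; the function name, the variable names and all the numbers are illustrative assumptions, and the iteration scheme is one convenient way to solve the equations rather than a method described in the talk.

```python
def excitation_probabilities(Lambda, lam, r, p_plus, p_minus, iters=1000, tol=1e-12):
    """Solve q[i] = excitatory_rate[i] / (r[i] + inhibitory_rate[i]) by iteration.

    Lambda[i], lam[i]: external excitatory / inhibitory spike arrival rates at cell i.
    r[i]: firing rate of cell i when it is excited.
    p_plus[j][i], p_minus[j][i]: probability that a spike fired by cell j reaches
    cell i as an excitatory / inhibitory spike.
    """
    n = len(r)
    q = [0.0] * n
    for _ in range(iters):
        new_q = []
        for i in range(n):
            exc = Lambda[i] + sum(q[j] * r[j] * p_plus[j][i] for j in range(n))
            inh = lam[i] + sum(q[j] * r[j] * p_minus[j][i] for j in range(n))
            # The formula is a probability only while the ratio is below 1;
            # the clip keeps this sketch well behaved outside that regime.
            new_q.append(min(1.0, exc / (r[i] + inh)))
        if max(abs(a - b) for a, b in zip(new_q, q)) < tol:
            return new_q
        q = new_q
    return q

# Hypothetical two-cell network; each row of p_plus plus p_minus sums to at most 1.
Lambda = [0.5, 0.2]
lam = [0.1, 0.1]
r = [1.0, 1.0]
p_plus = [[0.0, 0.5], [0.3, 0.0]]
p_minus = [[0.0, 0.3], [0.2, 0.0]]
q = excitation_probabilities(Lambda, lam, r, p_plus, p_minus)
print(q)
```

Each cell's probability is computed from the flows of spikes reaching it, which is why the cost of one sweep grows only with the number of connections.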

Furthermore, the result says something even more interesting: that the joint probability distribution of the state of a very large system is the product of the individual probabilities of its subsystems. This simplifies the calculations a lot: it means that you can calculate one part and calculate another separately, then put them together, and the result is correct. So this product form result is the second important result.
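Written out, the product form looks as follows. The notation is an assumption taken from the standard presentation of the model rather than from the talk: k_i is the non-negative integer internal state of cell i (it is excited when k_i > 0) and q_i is the probability that cell i is excited in the long run.

```latex
% Stationary joint distribution of an n-cell network, valid when every q_i < 1
P(k_1, \dots, k_n) \;=\; \prod_{i=1}^{n} (1 - q_i)\, q_i^{k_i},
\qquad k_i \in \{0, 1, 2, \dots\}
```

Each factor involves only one cell, which is why the cells can be calculated separately and then multiplied together.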

So these are the mathematical properties of the network. Because it has such nice mathematical properties, back in the 1990s I was able to show that learning algorithms for this network are of polynomial time complexity. What does that mean? If you take any problem, calculating something related to it may require a search through all possible solutions. This is called exponential time complexity, and it is the normal situation: you try everything out and you get the solution. On the other hand, if you have polynomial time complexity, it means that you do things in a time proportional to the size, or to the size to the power of two, or to the size to the power of three. In the particular case of the random neural network the polynomial time complexity is the size to the power of three. It was the first time that a neural network model was shown to have a learning algorithm of polynomial time complexity when the network is completely connected, with all kinds of communication possible. Of course, much simpler cases had been shown to have this property, but because of the mathematical properties of this model we could get polynomial time complexity for a completely arbitrary, completely complex model.
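A rough arithmetic comparison shows how large the gap is. The sketch below counts steps for a size-cubed algorithm against an exhaustive search, assuming for illustration that the search tries every combination of n yes/no choices (2 to the power n); that particular search space and the function names are assumptions, not details from the talk.

```python
def cubic_steps(n):
    """Step count for a method whose cost grows as the size to the power of three."""
    return n ** 3

def exhaustive_steps(n):
    """Step count for trying every combination of n binary choices (assumed example)."""
    return 2 ** n

for n in (10, 20, 50, 100):
    print(n, cubic_steps(n), exhaustive_steps(n))
```

At a size of 100 the cubic method needs a million steps, while the exhaustive search needs more than 10 to the power of 30.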

With this came further developments. Later it was shown that this model has the property of universal approximation, so that it can approximate continuous and bounded functions. This is very useful. In recent years we have shown how it can be used for deep learning. Every time, we use these beautiful mathematical properties to simplify the computations, so that it is much faster to compute with a random neural network than with an ordinary deep learning system, simply because each cell’s state can be calculated individually from the flows of spikes from other cells. Another nice property is that if we need to use a very large network, then rather than say ‘this network has a million cells’ we can say ‘it has n cells where n tends to infinity’. So we can get the asymptotic properties of the network rather than having to say that n equals a million, a million and one, a million and two and so on.

So this network has a lot of very nice properties from a mathematical perspective. It also has a lot of interesting applications. For instance, one recent application is the detection of attacks in cybersecurity. Here a random neural network is trained on data concerning normal traffic, data concerning attack traffic and data concerning attack traffic mixed with normal traffic, and we can then use it to identify which parts of the traffic are attack traffic.

Another very interesting application is in how we organize routing in the Internet, how we move packets around the Internet. Here we use the random neural network with reinforcement learning, which is a technique where an advisor to the network observes its actions and says, ‘these are good, these are bad, modify your parameters so that you do the better things’. You continue doing this iteratively and the learning occurs over a long period of time: the system is continuously learning and improving itself. And of course, if the system’s conditions change, then the learning will adapt to these changing conditions. We are using random neural networks to do these kinds of controls for Internet interconnections.
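The "advisor" loop can be sketched in a few lines. This is a deliberately simplified stand-in, not the random neural network and not the routing system described here: one weight per candidate next hop is nudged up when the observed delay is better than a running average of recent results and nudged down when it is worse. The hop names, delays and parameters are all hypothetical.

```python
import random

def learn_next_hop(mean_delay_ms, steps=3000, explore=0.2, smoothing=0.9,
                   rate=0.01, seed=0):
    """Learn which next hop gives the lowest delay from good/bad feedback."""
    rng = random.Random(seed)
    hops = list(mean_delay_ms)
    weight = {hop: 0.0 for hop in hops}
    threshold = None  # running average of past rewards: the advisor's yardstick
    for _ in range(steps):
        if rng.random() < explore:
            hop = rng.choice(hops)                      # occasionally try any hop
        else:
            hop = max(hops, key=lambda h: weight[h])    # otherwise use the best so far
        # Simulated measurement: the hop's mean delay with +/- 20% noise.
        delay = mean_delay_ms[hop] * rng.uniform(0.8, 1.2)
        reward = -delay                                 # lower delay is better
        if threshold is None:
            threshold = reward
        # "Good" (above the yardstick) raises the weight, "bad" lowers it.
        weight[hop] += rate * (reward - threshold)
        threshold = smoothing * threshold + (1.0 - smoothing) * reward
    return max(hops, key=lambda h: weight[h]), weight

best, weights = learn_next_hop({"via_A": 50.0, "via_B": 10.0, "via_C": 30.0})
print(best, weights)
```

Because the yardstick is a running average, it follows the traffic conditions, so the same loop keeps adjusting if the delays change.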

Another very interesting application is in recognizing tumours in the brain. The tumours are difficult to detect in the brain because brains are non-homogeneous, non-identical organs. The variety between brains forces you to do detection based on very local small sets of properties, and the random neural network is very good at doing that.

So one starts by going back to the foundations, to what we have learnt about what neurons are, and we build a mathematical model for spiking networks, a mathematical model for real networks, which are random. Then we turn this mathematical model, which is inspired by biology, into an engineering tool which has numerous applications: in imaging, in the control of Internet-type systems, in the detection of cyberattacks and in other new areas that may come up.

One of the big challenges when we use all of these techniques is that we realise that our neurons work at the scale of 1 millisecond, a thousandth of a second, while a computer works at a fraction of a millisecond. So the computer is much faster. How am I able to do such complex things with much slower objects? That is an open question. It leads us to the following question: are we using the right models? We have tried to look at this. For instance, we have tried to study random neural networks which don’t just interact in the usual way, through excitation and inhibition, but also through what we call soma-to-soma interactions, where the ‘head’ of the cell communicates directly with the ‘head’ of the neighbouring cells without passing through the dendrites. So the whole issue of why we are able to do things so fast is quite mysterious. I think these are the interesting questions which will hopefully be addressed in the coming years.
